PREFACE

This textbook is an expanded version of Elementary Linear Algebra, Ninth Edition, by Howard Anton. The first ten chapters of 
this book are identical to the first ten chapters of that text; the eleventh chapter consists of 21 applications of linear algebra 
drawn from business, economics, engineering, physics, computer science, approximation theory, ecology, sociology, 
demography, and genetics. The applications are, with one exception, independent of one another and each comes with a list of 
mathematical prerequisites. Thus, each instructor has the flexibility to choose those applications that are suitable for his or her 
students and to incorporate each application anywhere in the course after the mathematical prerequisites have been satisfied. 

This edition of Elementary Linear Algebra, like those that have preceded it, gives an elementary treatment of linear algebra that 
is suitable for students in their freshman or sophomore year. The aim is to present the fundamentals of linear algebra in the 
clearest possible way; pedagogy is the main consideration. Calculus is not a prerequisite, but there are clearly labeled exercises 
and examples for students who have studied calculus. Those exercises can be omitted without loss of continuity. Technology is 
also not required, but for those who would like to use MATLAB, Maple, Mathematica, or calculators with linear algebra 
capabilities, exercises have been included at the ends of the chapters that allow for further exploration of that chapter's 
contents. 



SUMMARY OF CHANGES IN THIS EDITION

This edition contains organizational changes and additional material suggested by users of the text. Most of the text is 
unchanged. The entire text has been reviewed for accuracy, typographical errors, and areas where the exposition could be 
improved or additional examples are needed. The following changes have been made: 

* Section 6.5 has been split into two sections: Section 6.5 Change of Basis and Section 6.6 Orthogonal Matrices. This
  allows for sharper focus on each topic.

* A new Section 4.4 Spaces of Polynomials has been added to further smooth the transition to general linear
  transformations, and a new Section 8.6 Isomorphisms has been added to provide explicit coverage of this topic.

* Chapter 2 has been reorganized by switching Section 2.1 with Section 2.4. The cofactor expansion approach to
  determinants is now covered first and the combinatorial approach is now at the end of the chapter.

* Additional exercises, including Discussion and Discovery, Supplementary, and Technology exercises, have been added
  throughout the text.

* In response to instructors' requests, the number of exercises that have answers in the back of the book has been reduced
  considerably.

* The page design has been modified to enhance the readability of the text.

* A new section on the earliest applications of linear algebra has been added to Chapter 11. This section shows how linear
  equations were used to solve practical problems in ancient Egypt, Babylonia, Greece, China, and India.



Hallmark Features 

Relationships Between Concepts One of the important goals of a course in linear algebra is to establish the intricate 
thread of relationships between systems of linear equations, matrices, determinants, vectors, linear transformations, and 
eigenvalues. That thread of relationships is developed through the following crescendo of theorems that link each new 
idea with ideas that preceded it: 1.5.3, 1.6.4, 2.3.6, 4.3.4, 5.6.9, 6.2.7, 6.4.5, 7.1.5. These theorems bring a coherence to 
the linear algebra landscape and also serve as a constant source of review. 

Smooth Transition to Abstraction The transition from $R^n$ to general vector spaces is often difficult for students. To
smooth out that transition, the underlying geometry of $R^n$ is emphasized and key ideas are developed in $R^n$ before
proceeding to general vector spaces.

Early Exposure to Linear Transformations and Eigenvalues To ensure that the material on linear transformations
and eigenvalues does not get lost at the end of the course, some of the basic concepts relating to those topics are
developed early in the text and then reviewed and expanded on when the topic is treated in more depth later in the text.
For example, characteristic equations are discussed briefly in the chapter on determinants, and linear transformations from
$R^n$ to $R^m$ are discussed immediately after $R^n$ is introduced, then reviewed later in the context of general linear
transformations.



About the Exercises 

Each section exercise set begins with routine drill problems, progresses to problems with more substance, and concludes with 
theoretical problems. In most sections, the main part of the exercise set is followed by the Discussion and Discovery problems 
described above. Most chapters end with a set of supplementary exercises that tend to be more challenging and force the 
student to draw on ideas from the entire chapter rather than a specific section. The technology exercises follow the 
supplementary exercises and are classified according to the section in which we suggest that they be assigned. Data for these 
exercises in MATLAB, Maple, and Mathematica formats can be downloaded from www.wiley.com/college/anton . 

About Chapter 11 

This chapter consists of 21 applications of linear algebra. With one clearly marked exception, each application is in its own 
independent section, so that sections can be deleted or permuted freely to fit individual needs and interests. Each topic begins 
with a list of linear algebra prerequisites so that a reader can tell in advance if he or she has sufficient background to read the 
section. 

Because the topics vary considerably in difficulty, we have included a subjective rating of each topic — easy, moderate, more 
difficult. (See "A Guide for the Instructor" following this preface.) Our evaluation is based more on the intrinsic difficulty of
the material than on the number of prerequisites; thus, a topic requiring fewer mathematical prerequisites may be rated
harder than one requiring more prerequisites.

Because our primary objective is to present applications of linear algebra, proofs are often omitted. We assume that the reader 
has met the linear algebra prerequisites and whenever results from other fields are needed, they are stated precisely (with 
motivation where possible), but usually without proof. 

Since there is more material in this book than can be covered in a one-semester or one-quarter course, the instructor will have 
to make a selection of topics. Help in making this selection is provided in the Guide for the Instructor below. 

Supplementary Materials for Students 

Student Solutions Manual, Ninth Edition — This supplement provides detailed solutions to most theoretical exercises and to at 
least one nonroutine exercise of every type. (ISBN 0-471-43329-2) 



Data for Technology Exercises is provided in MATLAB, Maple, and Mathematica formats. This data can be downloaded from 
www.wiley.com/college/anton . 

Linear Algebra Solutions — Powered by JustAsk! invites you to be a part of the solution as it walks you step-by-step through a 
total of over 150 problems that correlate to chapter materials to help you master key ideas. The powerful online 
problem-solving tool provides you with more than just the answers.

Supplementary Materials for Instructors 

Instructor's Solutions Manual — This new supplement provides solutions to all exercises in the text. (ISBN 0-471-44798-6) 

Test Bank — This includes approximately 50 free-form questions, five essay questions for each chapter, and a sample 
cumulative final examination. (ISBN 0-471-44797-8) 

eGrade — eGrade is an online assessment system that contains a large bank of skill-building problems, homework problems, 
and solutions. Instructors can automate the process of assigning, delivering, grading, and routing all kinds of homework, 
quizzes, and tests while providing students with immediate scoring and feedback on their work. Wiley eGrade "does the 
math". . . and much more. For more information, visit http://www.wiley.com/college/egrade or contact your Wiley 
representative. 

Web Resources — More information about this text and its resources can be obtained from your Wiley representative or from 
www.wiley.com/college/anton . 



A GUIDE FOR THE INSTRUCTOR



Linear algebra courses vary widely between institutions in content and philosophy, but most courses fall into two categories: 
those with about 35-40 lectures (excluding tests and reviews) and those with about 25-30 lectures (excluding tests and 
reviews). Accordingly, I have created long and short templates as possible starting points for constructing a course outline. In 
the long template I have assumed that all sections in the indicated chapters are covered, and in the short template I have 
assumed that instructors will make selections from the chapters to fit the available time. Of course, these are just guides and 
you may want to customize them to fit your local interests and requirements. 

The organization of the text has been carefully designed to make life easier for instructors working under time constraints: A 
brief introduction to eigenvalues and eigenvectors occurs in Sections 2.3 and 4.3, and linear transformations from $R^n$ to $R^m$ are 
discussed in Chapter 4. This makes it possible for all instructors to cover these topics at a basic level when the time available 
for their more extensive coverage in Chapters 7 and 8 is limited. Also, note that Chapter 3 can be omitted without loss of 
continuity for students who are already familiar with the material. 





             Long Template    Short Template

Chapter 1    7 lectures       6 lectures
Chapter 2    4 lectures       3 lectures
Chapter 4    4 lectures       4 lectures
Chapter 5    7 lectures       6 lectures
Chapter 6    6 lectures       3 lectures
Chapter 7    4 lectures       3 lectures
Chapter 8    6 lectures       2 lectures

Total        38 lectures      27 lectures



Variations in the Standard Course 

Many variations in the long template are possible. For example, one might create an alternative long template by following the 
time allocations in the short template and devoting the remaining 11 lectures to some of the topics in Chapters 9, 10 and 11. 

An Applications-Oriented Course 

Once the necessary core material is covered, the instructor can choose applications from Chapter 9 or Chapter 11. The 
following table classifies each of the 21 sections in Chapter 11 according to difficulty: 

Easy. The average student who has met the stated prerequisites should be able to read the material with no help from the 
instructor. 



Moderate. The average student who has met the stated prerequisites may require a little help from the instructor. 
More Difficult. The average student who has met the stated prerequisites will probably need help from the instructor. 



[Table: difficulty ratings (Easy, Moderate, More Difficult) for Sections 1 through 21 of Chapter 11; the individual entries are not recoverable here.]
CHAPTER 1



Systems of Linear Equations and Matrices 



INTRODUCTION: Information in science and mathematics is often organized into rows and columns to form rectangular arrays, 
called "matrices" (plural of "matrix"). Matrices are often tables of numerical data that arise from physical observations, but they also 
occur in various mathematical contexts. For example, we shall see in this chapter that to solve a system of equations such as 

\[
\begin{aligned}
5x + y &= 3 \\
2x - y &= 4
\end{aligned}
\]

all of the information required for the solution is embodied in the matrix

\[
\begin{bmatrix} 5 & 1 & 3 \\ 2 & -1 & 4 \end{bmatrix}
\]

and that the solution can be obtained by performing appropriate operations on this matrix. This is particularly important in 
developing computer programs to solve systems of linear equations because computers are well suited for manipulating arrays of 
numerical information. However, matrices are not simply a notational tool for solving systems of equations; they can be viewed as 
mathematical objects in their own right, and there is a rich and important theory associated with them that has a wide variety of 
applications. In this chapter we will begin the study of matrices. 
1.1 INTRODUCTION TO SYSTEMS OF LINEAR EQUATIONS

Systems of linear algebraic equations and their solutions constitute one of the major topics studied in the course known as
"linear algebra." In this first section we shall introduce some basic terminology and discuss a method for solving such
systems.



Linear Equations 

Any straight line in the $xy$-plane can be represented algebraically by an equation of the form

\[
a_1 x + a_2 y = b
\]

where $a_1$, $a_2$, and $b$ are real constants and $a_1$ and $a_2$ are not both zero. An equation of this form is called a linear equation in the
variables $x$ and $y$. More generally, we define a linear equation in the $n$ variables $x_1, x_2, \ldots, x_n$ to be one that can be expressed in the
form

\[
a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b
\]

where $a_1, a_2, \ldots, a_n$, and $b$ are real constants. The variables in a linear equation are sometimes called unknowns.



EXAMPLE 1 Linear Equations 



The equations

\[
x + 3y = 7, \qquad y = \tfrac{1}{2}x + 3z + 1, \qquad \text{and} \qquad x_1 - 2x_2 - 3x_3 + x_4 = 7
\]

are linear. Observe that a linear equation does not involve any products or roots of variables. All variables occur only to the first
power and do not appear as arguments for trigonometric, logarithmic, or exponential functions. The equations

\[
x + 3\sqrt{y} = 5, \qquad 3x + 2y - z + xz = 4, \qquad \text{and} \qquad y = \sin x
\]

are not linear.

A solution of a linear equation $a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b$ is a sequence of $n$ numbers $s_1, s_2, \ldots, s_n$ such that the equation is
satisfied when we substitute $x_1 = s_1, x_2 = s_2, \ldots, x_n = s_n$. The set of all solutions of the equation is called its solution set or
sometimes the general solution of the equation.



EXAMPLE 2 Finding a Solution Set 



Find the solution set of (a) $4x - 2y = 1$, and (b) $x_1 - 4x_2 + 7x_3 = 5$.

Solution (a)

To find solutions of (a), we can assign an arbitrary value to $x$ and solve for $y$, or choose an arbitrary value for $y$ and solve for $x$. If
we follow the first approach and assign $x$ an arbitrary value $t$, we obtain

\[
x = t, \qquad y = 2t - \tfrac{1}{2}
\]

These formulas describe the solution set in terms of an arbitrary number $t$, called a parameter. Particular numerical solutions can be
obtained by substituting specific values for $t$. For example, $t = 3$ yields the solution $x = 3$, $y = \tfrac{11}{2}$; and $t = -\tfrac{1}{2}$ yields the solution
$x = -\tfrac{1}{2}$, $y = -\tfrac{3}{2}$.

If we follow the second approach and assign $y$ the arbitrary value $t$, we obtain

\[
x = \tfrac{1}{2}t + \tfrac{1}{4}, \qquad y = t
\]

Although these formulas are different from those obtained above, they yield the same solution set as $t$ varies over all possible real
numbers. For example, the previous formulas gave the solution $x = 3$, $y = \tfrac{11}{2}$ when $t = 3$, whereas the formulas immediately above
yield that solution when $t = \tfrac{11}{2}$.

Solution (b)

To find the solution set of (b), we can assign arbitrary values to any two variables and solve for the third variable. In particular, if
we assign arbitrary values $s$ and $t$ to $x_2$ and $x_3$, respectively, and solve for $x_1$, we obtain

\[
x_1 = 5 + 4s - 7t, \qquad x_2 = s, \qquad x_3 = t
\]
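
The parametric descriptions in Example 2 can be checked mechanically. The following Python sketch substitutes the formulas $x = t$, $y = 2t - \tfrac{1}{2}$ and $x = \tfrac{1}{2}t + \tfrac{1}{4}$, $y = t$ into $4x - 2y = 1$, and $x_1 = 5 + 4s - 7t$, $x_2 = s$, $x_3 = t$ into $x_1 - 4x_2 + 7x_3 = 5$. The function names are illustrative and not part of the text, and exact rational arithmetic is an assumed convenience so that the fractions are not rounded.

```python
from fractions import Fraction


def solution_a_first(t):
    """First parametrization of 4x - 2y = 1: x = t, y = 2t - 1/2."""
    t = Fraction(t)
    return t, 2 * t - Fraction(1, 2)


def solution_a_second(t):
    """Second parametrization of 4x - 2y = 1: x = t/2 + 1/4, y = t."""
    t = Fraction(t)
    return t / 2 + Fraction(1, 4), t


def solution_b(s, t):
    """Parametrization of x1 - 4*x2 + 7*x3 = 5 with x2 = s and x3 = t."""
    s, t = Fraction(s), Fraction(t)
    return 5 + 4 * s - 7 * t, s, t


# Every sampled parameter value must satisfy the original equation.
for t in range(-3, 4):
    x, y = solution_a_first(t)
    assert 4 * x - 2 * y == 1
    x, y = solution_a_second(t)
    assert 4 * x - 2 * y == 1
    for s in range(-3, 4):
        x1, x2, x3 = solution_b(s, t)
        assert x1 - 4 * x2 + 7 * x3 == 5

# The same point (3, 11/2) arises from t = 3 in the first
# parametrization and from t = 11/2 in the second.
same_point = solution_a_first(3) == solution_a_second(Fraction(11, 2))
```

A finite sample of parameter values does not prove that the two parametrizations describe the same set; it only illustrates the claim made in the example.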

Linear Systems 

A finite set of linear equations in the variables $x_1, x_2, \ldots, x_n$ is called a system of linear equations or a linear system. A sequence
of numbers $s_1, s_2, \ldots, s_n$ is called a solution of the system if $x_1 = s_1, x_2 = s_2, \ldots, x_n = s_n$ is a solution of every equation in the
system. For example, the system

\[
\begin{aligned}
4x_1 - x_2 + 3x_3 &= -1 \\
3x_1 + x_2 + 9x_3 &= -4
\end{aligned}
\]

has the solution $x_1 = 1$, $x_2 = 2$, $x_3 = -1$ since these values satisfy both equations. However, $x_1 = 1$, $x_2 = 8$, $x_3 = 1$ is not a
solution since these values satisfy only the first equation in the system.
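
Checking whether a sequence of numbers is a solution amounts to substituting it into every equation. A minimal Python sketch follows; the function name `is_solution` and the representation of each equation as a pair (coefficient list, constant) are assumptions made for illustration.

```python
def is_solution(equations, values):
    """Return True if the values satisfy every equation of the system.

    Each equation is a pair (coefficients, constant) standing for
    coefficients[0]*x1 + coefficients[1]*x2 + ... = constant.
    """
    return all(
        sum(a * s for a, s in zip(coefficients, values)) == constant
        for coefficients, constant in equations
    )


# The system 4*x1 - x2 + 3*x3 = -1, 3*x1 + x2 + 9*x3 = -4
system = [([4, -1, 3], -1), ([3, 1, 9], -4)]

print(is_solution(system, [1, 2, -1]))  # True: both equations hold
print(is_solution(system, [1, 8, 1]))   # False: only the first holds
```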

Not all systems of linear equations have solutions. For example, if we multiply the second equation of the system

\[
\begin{aligned}
x + y &= 4 \\
2x + 2y &= 6
\end{aligned}
\]

by $\tfrac{1}{2}$, it becomes evident that there are no solutions since the resulting equivalent system

\[
\begin{aligned}
x + y &= 4 \\
x + y &= 3
\end{aligned}
\]

has contradictory equations.

A system of equations that has no solutions is said to be inconsistent; if there is at least one solution of the system, it is called 
consistent. To illustrate the possibilities that can occur in solving systems of linear equations, consider a general system of two 
linear equations in the unknowns $x$ and $y$:

\[
\begin{aligned}
a_1 x + b_1 y &= c_1 \qquad (a_1, b_1 \text{ not both zero}) \\
a_2 x + b_2 y &= c_2 \qquad (a_2, b_2 \text{ not both zero})
\end{aligned}
\]

The graphs of these equations are lines; call them $l_1$ and $l_2$. Since a point $(x, y)$ lies on a line if and only if the numbers $x$ and $y$
satisfy the equation of the line, the solutions of the system of equations correspond to points of intersection of $l_1$ and $l_2$. There are
three possibilities, illustrated in Figure 1.1.1:



l ; 




(a) No solution 




(A) One solution 



L and L 




[€) Infinitdy many solutions 
Figure 1.1.1 

The lines / 1 and / 2 may be parallel, in which case there is no intersection and consequently no solution to the system. 

The lines / 1 and / 2 m ay intersect at only one point, in which case the system has exactly one solution. 

The lines / 1 and / 2 m ay coincide, in which case there are infinitely many points of intersection and consequently infinitely 
many solutions to the system. 



Although we have considered only two equations with two unknowns here, we will show later that the same three possibilities hold 
for arbitrary linear systems: 



Every system of linear equations has no solutions, or has exactly one solution, or has infinitely many solutions. 
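
For two equations in two unknowns, the three cases can be told apart from the coefficients alone. The Python sketch below is one possible way to do this and is not taken from the text: it assumes, as in the discussion of the two lines, that the coefficients of $x$ and $y$ in each equation are not both zero, and it uses the quantity $a_1 b_2 - a_2 b_1$, which is zero exactly when the lines are parallel or coincide. The function name `classify_2x2` is hypothetical.

```python
def classify_2x2(a1, b1, c1, a2, b2, c2):
    """Classify the system a1*x + b1*y = c1, a2*x + b2*y = c2.

    Assumes (a1, b1) and (a2, b2) are each not both zero, so that
    each equation really is the equation of a line.
    """
    if a1 * b2 - a2 * b1 != 0:
        # The lines have different directions and cross once.
        return "exactly one solution"
    # Same direction: the lines coincide only if the constants are
    # in the same proportion as the coefficients.
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return "infinitely many solutions"
    return "no solutions"


print(classify_2x2(1, 1, 4, 2, 2, 6))   # parallel lines
print(classify_2x2(1, 1, 4, 2, 2, 8))   # the same line twice
print(classify_2x2(5, 1, 3, 2, -1, 4))  # lines crossing at one point
```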



An arbitrary system of $m$ linear equations in $n$ unknowns can be written as

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
\]

where $x_1, x_2, \ldots, x_n$ are the unknowns and the subscripted $a$'s and $b$'s denote constants. For example, a general system of three
linear equations in four unknowns can be written as

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4 &= b_1 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + a_{24}x_4 &= b_2 \\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + a_{34}x_4 &= b_3
\end{aligned}
\]

The double subscripting on the coefficients of the unknowns is a useful device that is used to specify the location of the coefficient
in the system. The first subscript on the coefficient $a_{ij}$ indicates the equation in which the coefficient occurs, and the second
subscript indicates which unknown it multiplies. Thus, $a_{12}$ is in the first equation and multiplies unknown $x_2$.

Augmented Matrices 

If we mentally keep track of the location of the +'s, the $x$'s, and the ='s, a system of $m$ linear equations in $n$ unknowns can be
abbreviated by writing only the rectangular array of numbers:

\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & & \vdots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn} & b_m
\end{bmatrix}
\]

This is called the augmented matrix for the system. (The term matrix is used in mathematics to denote a rectangular array of
numbers. Matrices arise in many contexts, which we will consider in more detail in later sections.) For example, the augmented
matrix for the system of equations

\[
\begin{aligned}
x_1 + x_2 + 2x_3 &= 9 \\
2x_1 + 4x_2 - 3x_3 &= 1 \\
3x_1 + 6x_2 - 5x_3 &= 0
\end{aligned}
\]

is

\[
\begin{bmatrix}
1 & 1 & 2 & 9 \\
2 & 4 & -3 & 1 \\
3 & 6 & -5 & 0
\end{bmatrix}
\]



Remark When constructing an augmented matrix, we must write the unknowns in the same order in each equation, and the 
constants must be on the right. 
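
The remark about ordering can be made concrete in code. In the Python sketch below, each equation is given as a mapping from unknown names to coefficients together with its right-hand constant; this input format and the function name `augmented_matrix` are assumptions made for illustration. Fixing one list of unknowns up front guarantees that every row uses the same column order, and an unknown that does not appear in an equation gets coefficient 0.

```python
def augmented_matrix(equations, unknowns):
    """Build the augmented matrix with columns ordered as in `unknowns`.

    Each equation is a pair (coefficients, constant), where
    `coefficients` maps an unknown's name to its coefficient.
    """
    rows = []
    for coefficients, constant in equations:
        # Missing unknowns contribute a zero entry.
        row = [coefficients.get(name, 0) for name in unknowns]
        rows.append(row + [constant])
    return rows


# x1 + x2 + 2*x3 = 9, 2*x1 + 4*x2 - 3*x3 = 1, 3*x1 + 6*x2 - 5*x3 = 0.
# The third equation is deliberately entered in a different order.
system = [
    ({"x1": 1, "x2": 1, "x3": 2}, 9),
    ({"x1": 2, "x2": 4, "x3": -3}, 1),
    ({"x3": -5, "x1": 3, "x2": 6}, 0),
]

M = augmented_matrix(system, ["x1", "x2", "x3"])
for row in M:
    print(row)
```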

The basic method for solving a system of linear equations is to replace the given system by a new system that has the same solution 
set but is easier to solve. This new system is generally obtained in a series of steps by applying the following three types of 
operations to eliminate unknowns systematically: 

1. Multiply an equation through by a nonzero constant. 



2. Interchange two equations. 



3. Add a multiple of one equation to another. 



Since the rows (horizontal lines) of an augmented matrix correspond to the equations in the associated system, these three 
operations correspond to the following operations on the rows of the augmented matrix: 



1. Multiply a row through by a nonzero constant. 



2. Interchange two rows. 



3. Add a multiple of one row to another row. 



Elementary Row Operations 

These are called elementary row operations. The following example illustrates how these operations can be used to solve systems 
of linear equations. Since a systematic procedure for finding solutions will be derived in the next section, it is not necessary to 
worry about how the steps in this example were selected. The main effort at this time should be devoted to understanding the 
computations and the discussion. 
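
The three elementary row operations translate directly into code. The following Python sketch stores a matrix as a list of rows and modifies it in place; the function names and the zero-based row indices are illustrative choices, and `Fraction` is used only so that divisions stay exact.

```python
from fractions import Fraction


def scale_row(M, i, c):
    """Operation 1: multiply row i through by a nonzero constant c."""
    if c == 0:
        raise ValueError("the constant must be nonzero")
    M[i] = [c * a for a in M[i]]


def swap_rows(M, i, j):
    """Operation 2: interchange rows i and j."""
    M[i], M[j] = M[j], M[i]


def add_multiple(M, source, target, c):
    """Operation 3: add c times row `source` to row `target`."""
    M[target] = [a + c * b for a, b in zip(M[target], M[source])]


M = [[Fraction(a) for a in row]
     for row in [[1, 1, 2, 9], [2, 4, -3, 1], [3, 6, -5, 0]]]

add_multiple(M, 0, 1, -2)       # row 2 becomes [0, 2, -7, -17]
scale_row(M, 1, Fraction(1, 2))  # row 2 becomes [0, 1, -7/2, -17/2]
```

Requiring a nonzero constant in the first operation matters: multiplying an equation by zero would discard it and could enlarge the solution set.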



EXAMPLE 3 Using Elementary Row Operations 

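In the left column below we solve a system of linear equations by operating on the equations in the system, and in the right column
we solve the same system by operating on the rows of the augmented matrix.

\[
\begin{aligned}
x + y + 2z &= 9 \\
2x + 4y - 3z &= 1 \\
3x + 6y - 5z &= 0
\end{aligned}
\qquad\qquad
\begin{bmatrix}
1 & 1 & 2 & 9 \\
2 & 4 & -3 & 1 \\
3 & 6 & -5 & 0
\end{bmatrix}
\]

Add $-2$ times the first equation (row) to the second to obtain

\[
\begin{aligned}
x + y + 2z &= 9 \\
2y - 7z &= -17 \\
3x + 6y - 5z &= 0
\end{aligned}
\qquad\qquad
\begin{bmatrix}
1 & 1 & 2 & 9 \\
0 & 2 & -7 & -17 \\
3 & 6 & -5 & 0
\end{bmatrix}
\]

Add $-3$ times the first equation (row) to the third to obtain

\[
\begin{aligned}
x + y + 2z &= 9 \\
2y - 7z &= -17 \\
3y - 11z &= -27
\end{aligned}
\qquad\qquad
\begin{bmatrix}
1 & 1 & 2 & 9 \\
0 & 2 & -7 & -17 \\
0 & 3 & -11 & -27
\end{bmatrix}
\]

Multiply the second equation (row) by $\tfrac{1}{2}$ to obtain

\[
\begin{aligned}
x + y + 2z &= 9 \\
y - \tfrac{7}{2}z &= -\tfrac{17}{2} \\
3y - 11z &= -27
\end{aligned}
\qquad\qquad
\begin{bmatrix}
1 & 1 & 2 & 9 \\
0 & 1 & -\tfrac{7}{2} & -\tfrac{17}{2} \\
0 & 3 & -11 & -27
\end{bmatrix}
\]

Add $-3$ times the second equation (row) to the third to obtain

\[
\begin{aligned}
x + y + 2z &= 9 \\
y - \tfrac{7}{2}z &= -\tfrac{17}{2} \\
-\tfrac{1}{2}z &= -\tfrac{3}{2}
\end{aligned}
\qquad\qquad
\begin{bmatrix}
1 & 1 & 2 & 9 \\
0 & 1 & -\tfrac{7}{2} & -\tfrac{17}{2} \\
0 & 0 & -\tfrac{1}{2} & -\tfrac{3}{2}
\end{bmatrix}
\]

Multiply the third equation (row) by $-2$ to obtain

\[
\begin{aligned}
x + y + 2z &= 9 \\
y - \tfrac{7}{2}z &= -\tfrac{17}{2} \\
z &= 3
\end{aligned}
\qquad\qquad
\begin{bmatrix}
1 & 1 & 2 & 9 \\
0 & 1 & -\tfrac{7}{2} & -\tfrac{17}{2} \\
0 & 0 & 1 & 3
\end{bmatrix}
\]

Add $-1$ times the second equation (row) to the first to obtain

\[
\begin{aligned}
x + \tfrac{11}{2}z &= \tfrac{35}{2} \\
y - \tfrac{7}{2}z &= -\tfrac{17}{2} \\
z &= 3
\end{aligned}
\qquad\qquad
\begin{bmatrix}
1 & 0 & \tfrac{11}{2} & \tfrac{35}{2} \\
0 & 1 & -\tfrac{7}{2} & -\tfrac{17}{2} \\
0 & 0 & 1 & 3
\end{bmatrix}
\]

Add $-\tfrac{11}{2}$ times the third equation (row) to the first and $\tfrac{7}{2}$ times the third equation (row) to the second to obtain

\[
\begin{aligned}
x &= 1 \\
y &= 2 \\
z &= 3
\end{aligned}
\qquad\qquad
\begin{bmatrix}
1 & 0 & 0 & 1 \\
0 & 1 & 0 & 2 \\
0 & 0 & 1 & 3
\end{bmatrix}
\]

The solution $x = 1$, $y = 2$, $z = 3$ is now evident.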
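
The sequence of row operations in Example 3 can be replayed step by step. The Python sketch below encodes each operation as a tuple and applies it to the augmented matrix of the system $x + y + 2z = 9$, $2x + 4y - 3z = 1$, $3x + 6y - 5z = 0$. The tuple encoding, the function name `apply_step`, and the zero-based row indices are assumptions made for illustration.

```python
from fractions import Fraction as F


def apply_step(M, step):
    """Apply one elementary row operation, given as a tuple, in place."""
    if step[0] == "scale":
        _, i, c = step
        M[i] = [c * a for a in M[i]]
    elif step[0] == "add":
        # Add c times row `source` to row `target`.
        _, source, target, c = step
        M[target] = [a + c * b for a, b in zip(M[target], M[source])]


M = [[F(1), F(1), F(2), F(9)],
     [F(2), F(4), F(-3), F(1)],
     [F(3), F(6), F(-5), F(0)]]

steps = [
    ("add", 0, 1, F(-2)),      # -2 times row 1 added to row 2
    ("add", 0, 2, F(-3)),      # -3 times row 1 added to row 3
    ("scale", 1, F(1, 2)),     # row 2 multiplied by 1/2
    ("add", 1, 2, F(-3)),      # -3 times row 2 added to row 3
    ("scale", 2, F(-2)),       # row 3 multiplied by -2
    ("add", 1, 0, F(-1)),      # -1 times row 2 added to row 1
    ("add", 2, 0, F(-11, 2)),  # -11/2 times row 3 added to row 1
    ("add", 2, 1, F(7, 2)),    # 7/2 times row 3 added to row 2
]

for step in steps:
    apply_step(M, step)

# The last column now holds the solution x, y, z.
x, y, z = (row[3] for row in M)
```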



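Exercise Set 1.1

1. Which of the following are linear equations in $x_1$, $x_2$, and $x_3$?

(a) $x_1 + 5x_2 - \sqrt{2}\,x_3 = 1$

(b) $x_1 + 3x_2 + x_1 x_3 = 2$

(c) $x_1 = -7x_2 + 3x_3$

(d) $x_1^{-2} + x_2 + 8x_3 = 5$

(e) $x_1^{3/5} - 2x_2 + x_3 = 4$

(f) $\pi x_1 - \sqrt{2}\,x_2 + \tfrac{1}{3}x_3 = 7^{1/3}$

2. Given that $k$ is a constant, which of the following are linear equations?

(a) $x_1 - x_2 + x_3 = \sin k$

(b) $k x_1 - \tfrac{1}{k}x_2 = 9$

(c) $2^k x_1 + 7x_2 - x_3 = 0$

3. Find the solution set of each of the following linear equations.

(a) $7x - 5y = 3$

(b) $3x_1 - 5x_2 + 4x_3 = 7$

(c) $-8x_1 + 2x_2 - 5x_3 + 6x_4 = 1$

(d) $3v - 8w + 2x - y + 4z = 0$

4. Find the augmented matrix for each of the following systems of linear equations.

(a)
\[
\begin{aligned}
3x_1 - 2x_2 &= -1 \\
4x_1 + 5x_2 &= 3 \\
7x_1 + 3x_2 &= 2
\end{aligned}
\]

(b)
\[
\begin{aligned}
2x_1 + 2x_3 &= 1 \\
3x_1 - x_2 + 4x_3 &= 7 \\
6x_1 + x_2 - x_3 &= 0
\end{aligned}
\]

(c)
\[
\begin{aligned}
x_1 + 2x_2 - x_4 + x_5 &= 1 \\
3x_2 + x_3 - x_5 &= 2 \\
x_3 + 7x_4 &= 1
\end{aligned}
\]

(d)
\[
\begin{aligned}
x_1 &= 1 \\
x_2 &= 2 \\
x_3 &= 3
\end{aligned}
\]

5. Find a system of linear equations corresponding to the augmented matrix.

(a)
\[
\begin{bmatrix} 2 & 0 & 0 \\ 3 & -4 & 0 \\ 0 & 1 & 1 \end{bmatrix}
\]

(b)
\[
\begin{bmatrix} 3 & 0 & -2 & 5 \\ 7 & 1 & 4 & -3 \\ 0 & -2 & 1 & 7 \end{bmatrix}
\]

(c)
\[
\begin{bmatrix} 7 & 2 & 1 & -3 & 5 \\ 1 & 2 & 4 & 0 & 1 \end{bmatrix}
\]

(d)
\[
\begin{bmatrix} 1 & 0 & 0 & 0 & 7 \\ 0 & 1 & 0 & 0 & -2 \\ 0 & 0 & 1 & 0 & 3 \\ 0 & 0 & 0 & 1 & 4 \end{bmatrix}
\]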

6. (a) Find a linear equation in the variables $x$ and $y$ that has the general solution $x = 5 + 2t$, $y = t$.

(b) Show that $x = t$, $y = \tfrac{1}{2}t - \tfrac{5}{2}$ is also the general solution of the equation in part (a).

7. The curve $y = ax^2 + bx + c$ shown in the accompanying figure passes through the points $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$. Show
that the coefficients $a$, $b$, and $c$ are a solution of the system of linear equations whose augmented matrix is

\[
\begin{bmatrix}
x_1^2 & x_1 & 1 & y_1 \\
x_2^2 & x_2 & 1 & y_2 \\
x_3^2 & x_3 & 1 & y_3
\end{bmatrix}
\]

Figure Ex-7

8. Consider the system of equations

\[
\begin{aligned}
x + y + 2z &= a \\
x + z &= b \\
2x + y + 3z &= c
\end{aligned}
\]

Show that for this system to be consistent, the constants $a$, $b$, and $c$ must satisfy $c = a + b$.

9. Show that if the linear equations $x_1 + kx_2 = c$ and $x_1 + lx_2 = d$ have the same solution set, then the equations are identical.

10. Show that the elementary row operations do not affect the solution set of a linear system.

Discussion and Discovery

11. For which value(s) of the constant $k$ does the system

\[
\begin{aligned}
x - y &= 3 \\
2x - 2y &= k
\end{aligned}
\]

have no solutions? Exactly one solution? Infinitely many solutions? Explain your reasoning.

12. Consider the system of equations

\[
\begin{aligned}
ax + by &= k \\
cx + dy &= l \\
ex + fy &= m
\end{aligned}
\]

Indicate what we can say about the relative positions of the lines $ax + by = k$, $cx + dy = l$, and
$ex + fy = m$ when

(a) the system has no solutions.

(b) the system has exactly one solution.

(c) the system has infinitely many solutions.

13. If the system of equations in Exercise 12 is consistent, explain why at least one equation can be
discarded from the system without altering the solution set.

14. If $k = l = m = 0$ in Exercise 12, explain why the system must be consistent. What can be said about
the point of intersection of the three lines if the system has exactly one solution?

15. We could also define elementary column operations in analogy with the elementary row operations.
What can you say about the effect of elementary column operations on the solution set of a linear
system? How would you interpret the effects of elementary column operations?
1.2 GAUSSIAN ELIMINATION

In this section we shall develop a systematic procedure for solving systems of linear equations. The procedure is based on
the idea of reducing the augmented matrix of a system to another augmented matrix that is simple enough that the
solution of the system can be found by inspection.



Echelon Forms 

In Example 3 of the last section, we solved a linear system in the unknowns $x$, $y$, and $z$ by reducing the augmented matrix to the
form

\[
\begin{bmatrix}
1 & 0 & 0 & 1 \\
0 & 1 & 0 & 2 \\
0 & 0 & 1 & 3
\end{bmatrix}
\]

from which the solution $x = 1$, $y = 2$, $z = 3$ became evident. This is an example of a matrix that is in reduced row-echelon form. To
be of this form, a matrix must have the following properties:

1. If a row does not consist entirely of zeros, then the first nonzero number in the row is a 1. We call this a leading 1.

2. If there are any rows that consist entirely of zeros, then they are grouped together at the bottom of the matrix.

3. In any two successive rows that do not consist entirely of zeros, the leading 1 in the lower row occurs farther to the right than 
the leading 1 in the higher row. 

4. Each column that contains a leading 1 has zeros everywhere else in that column. 

A matrix that has the first three properties is said to be in row-echelon form. (Thus, a matrix in reduced row-echelon form is of 
necessity in row-echelon form, but not conversely.) 
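
The four defining properties can be tested mechanically. The Python sketch below checks properties 1 through 3 for row-echelon form and adds property 4 for reduced row-echelon form; the function names are illustrative and a matrix is assumed to be given as a list of rows of equal length.

```python
def leading_index(row):
    """Column index of the first nonzero entry, or None for a zero row."""
    for j, a in enumerate(row):
        if a != 0:
            return j
    return None


def is_row_echelon(M):
    """Check properties 1, 2, and 3."""
    seen_zero_row = False
    previous_lead = -1
    for row in M:
        lead = leading_index(row)
        if lead is None:
            seen_zero_row = True
            continue
        if seen_zero_row:          # property 2: zero rows at the bottom
            return False
        if row[lead] != 1:         # property 1: first nonzero entry is 1
            return False
        if lead <= previous_lead:  # property 3: leading 1's move right
            return False
        previous_lead = lead
    return True


def is_reduced_row_echelon(M):
    """Check properties 1 through 4."""
    if not is_row_echelon(M):
        return False
    for i, row in enumerate(M):
        lead = leading_index(row)
        if lead is None:
            continue
        # Property 4: zeros everywhere else in a leading-1 column.
        if any(M[k][lead] != 0 for k in range(len(M)) if k != i):
            return False
    return True


print(is_reduced_row_echelon([[1, 0, 0, 1], [0, 1, 0, 2], [0, 0, 1, 3]]))
print(is_row_echelon([[1, 4, -3, 7], [0, 1, 6, 2], [0, 0, 1, 5]]))
```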



EXAMPLE 1 Row-Echelon and Reduced Row-Echelon Form 



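The following matrices are in reduced row-echelon form.

\[
\begin{bmatrix} 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & 7 \\ 0 & 0 & 1 & -1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 1 & -2 & 0 & 1 \\ 0 & 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]

The following matrices are in row-echelon form.

\[
\begin{bmatrix} 1 & 4 & -3 & 7 \\ 0 & 1 & 6 & 2 \\ 0 & 0 & 1 & 5 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 1 & 2 & 6 & 0 \\ 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}
\]

We leave it for you to confirm that each of the matrices in this example satisfies all of the requirements for its stated form.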

EXAMPLE 2 More on Row-Echelon and Reduced Row-Echelon Form 



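As the last example illustrates, a matrix in row-echelon form has zeros below each leading 1, whereas a matrix in reduced
row-echelon form has zeros below and above each leading 1. Thus, with any real numbers substituted for the $*$'s, all matrices of the
following types are in row-echelon form:

\[
\begin{bmatrix} 1 & * & * & * \\ 0 & 1 & * & * \\ 0 & 0 & 1 & * \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & * & * & * \\ 0 & 1 & * & * \\ 0 & 0 & 1 & * \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 1 & * & * & * \\ 0 & 1 & * & * \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},
\]

\[
\begin{bmatrix}
0 & 1 & * & * & * & * & * & * & * & * \\
0 & 0 & 0 & 1 & * & * & * & * & * & * \\
0 & 0 & 0 & 0 & 1 & * & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & 1 & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & *
\end{bmatrix}
\]

Moreover, all matrices of the following types are in reduced row-echelon form:

\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 0 & * \\ 0 & 1 & 0 & * \\ 0 & 0 & 1 & * \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & * & * \\ 0 & 1 & * & * \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},
\]

\[
\begin{bmatrix}
0 & 1 & * & 0 & 0 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 1 & 0 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 1 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 0 & 1 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & *
\end{bmatrix}
\]
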

If, by a sequence of elementary row operations, the augmented matrix for a system of linear equations is put in reduced row-echelon 
form, then the solution set of the system will be evident by inspection or after a few simple steps. The next example illustrates this 
situation. 



EXAMPLE 3 Solutions of Four Linear Systems 



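Suppose that the augmented matrix for a system of linear equations has been reduced by row operations to the given reduced
row-echelon form. Solve the system.

(a)
\[
\begin{bmatrix} 1 & 0 & 0 & 5 \\ 0 & 1 & 0 & -2 \\ 0 & 0 & 1 & 4 \end{bmatrix}
\]

(b)
\[
\begin{bmatrix} 1 & 0 & 0 & 4 & -1 \\ 0 & 1 & 0 & 2 & 6 \\ 0 & 0 & 1 & 3 & 2 \end{bmatrix}
\]

(c)
\[
\begin{bmatrix} 1 & 6 & 0 & 0 & 4 & -2 \\ 0 & 0 & 1 & 0 & 3 & 1 \\ 0 & 0 & 0 & 1 & 5 & 2 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]

(d)
\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\]
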

Solution (a)

The corresponding system of equations is

\[
\begin{aligned}
x_1 &= 5 \\
x_2 &= -2 \\
x_3 &= 4
\end{aligned}
\]

By inspection, $x_1 = 5$, $x_2 = -2$, $x_3 = 4$.

Solution (b)

The corresponding system of equations is

\[
\begin{aligned}
x_1 + 4x_4 &= -1 \\
x_2 + 2x_4 &= 6 \\
x_3 + 3x_4 &= 2
\end{aligned}
\]

Since $x_1$, $x_2$, and $x_3$ correspond to leading 1's in the augmented matrix, we call them leading variables or pivots. The nonleading
variables (in this case $x_4$) are called free variables. Solving for the leading variables in terms of the free variable gives

\[
\begin{aligned}
x_1 &= -1 - 4x_4 \\
x_2 &= 6 - 2x_4 \\
x_3 &= 2 - 3x_4
\end{aligned}
\]

From this form of the equations we see that the free variable $x_4$ can be assigned an arbitrary value, say $t$, which then determines the
values of the leading variables $x_1$, $x_2$, and $x_3$. Thus there are infinitely many solutions, and the general solution is given by the
formulas

\[
x_1 = -1 - 4t, \qquad x_2 = 6 - 2t, \qquad x_3 = 2 - 3t, \qquad x_4 = t
\]

Solution (c)

The row of zeros leads to the equation $0x_1 + 0x_2 + 0x_3 + 0x_4 + 0x_5 = 0$, which places no restrictions on the solutions (why?).
Thus, we can omit this equation and write the corresponding system as

\[
\begin{aligned}
x_1 + 6x_2 + 4x_5 &= -2 \\
x_3 + 3x_5 &= 1 \\
x_4 + 5x_5 &= 2
\end{aligned}
\]

Here the leading variables are $x_1$, $x_3$, and $x_4$, and the free variables are $x_2$ and $x_5$. Solving for the leading variables in terms of the
free variables gives

\[
\begin{aligned}
x_1 &= -2 - 6x_2 - 4x_5 \\
x_3 &= 1 - 3x_5 \\
x_4 &= 2 - 5x_5
\end{aligned}
\]

Since $x_5$ can be assigned an arbitrary value, $t$, and $x_2$ can be assigned an arbitrary value, $s$, there are infinitely many solutions. The
general solution is given by the formulas

\[
x_1 = -2 - 6s - 4t, \qquad x_2 = s, \qquad x_3 = 1 - 3t, \qquad x_4 = 2 - 5t, \qquad x_5 = t
\]

Solution (d)

The last equation in the corresponding system of equations is

\[
0x_1 + 0x_2 + 0x_3 = 1
\]

Since this equation cannot be satisfied, there is no solution to the system.
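
The distinction between leading and free variables in Example 3 can be read off an augmented matrix in reduced row-echelon form. The Python sketch below returns the 1-based indices of the leading and free variables, or `None` when a row of the form $0\ 0\ \cdots\ 0\ 1$ shows the system to be inconsistent. The function name `leading_and_free` and its return convention are assumptions made for illustration, and the input is assumed to already be in reduced row-echelon form.

```python
def leading_and_free(M):
    """Return (leading, free) variable indices, or None if inconsistent.

    M is an augmented matrix in reduced row-echelon form; its last
    column holds the constants, so there are len(M[0]) - 1 unknowns.
    """
    n = len(M[0]) - 1
    leading = []
    for row in M:
        lead = next((j for j, a in enumerate(row) if a != 0), None)
        if lead is None:
            continue       # a zero row places no restriction
        if lead == n:
            return None    # the row reads 0 = 1: no solution
        leading.append(lead + 1)
    free = [j for j in range(1, n + 1) if j not in leading]
    return leading, free


matrix_c = [[1, 6, 0, 0, 4, -2],
            [0, 0, 1, 0, 3, 1],
            [0, 0, 0, 1, 5, 2],
            [0, 0, 0, 0, 0, 0]]
matrix_d = [[1, 0, 0, 0],
            [0, 1, 2, 0],
            [0, 0, 0, 1]]

print(leading_and_free(matrix_c))  # leading x1, x3, x4; free x2, x5
print(leading_and_free(matrix_d))  # None: the system is inconsistent
```

When the result is not `None`, the system has exactly one solution if the list of free variables is empty and infinitely many solutions otherwise.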

Elimination Methods 

We have just seen how easy it is to solve a system of linear equations once its augmented matrix is in reduced row-echelon form. 
Now we shall give a step-by-step elimination procedure that can be used to reduce any matrix to reduced row-echelon form. As we 
state each step in the procedure, we shall illustrate the idea by reducing the following matrix to reduced row-echelon form. 



0-20 7 12 
2 4 -10 6 12 28 
2 4 -5 6-5-1 



Step 1. Locate the leftmost column that does not consist entirely of zeros. 



0-20 7 12 
2 4 -10 6 12 23 
2 4 -5 6-5-1 

L- Leftmost nonzero c ohmm 



Step 2. Interchange the top row with another row, if necessary, to bring a nonzero entry to the top of the column found in Step 1. 



$$\begin{bmatrix} 2 & 4 & -10 & 6 & 12 & 28 \\ 0 & 0 & -2 & 0 & 7 & 12 \\ 2 & 4 & -5 & 6 & -5 & -1 \end{bmatrix}$$

The first and second rows in the preceding matrix were interchanged.



Step 3. If the entry that is now at the top of the column found in Step 1 is $a$, multiply the first row by $1/a$ in order to introduce a leading 1.

$$\begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & -2 & 0 & 7 & 12 \\ 2 & 4 & -5 & 6 & -5 & -1 \end{bmatrix}$$

The first row of the preceding matrix was multiplied by $\frac{1}{2}$.



Step 4. Add suitable multiples of the top row to the rows below so that all entries below the leading 1 become zeros. 



$$\begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & -2 & 0 & 7 & 12 \\ 0 & 0 & 5 & 0 & -17 & -29 \end{bmatrix}$$

$-2$ times the first row of the preceding matrix was added to the third row.



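Step 5. Now cover the top row in the matrix and begin again with Step 1 applied to the submatrix that remains. Continue in this way until the entire matrix is in row-echelon form.

$$\begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & -2 & 0 & 7 & 12 \\ 0 & 0 & 5 & 0 & -17 & -29 \end{bmatrix}$$

Leftmost nonzero column in the submatrix (the third column).

$$\begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & 1 & 0 & -\frac{7}{2} & -6 \\ 0 & 0 & 5 & 0 & -17 & -29 \end{bmatrix}$$

The first row in the submatrix was multiplied by $-\frac{1}{2}$ to introduce a leading 1.

$$\begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & 1 & 0 & -\frac{7}{2} & -6 \\ 0 & 0 & 0 & 0 & \frac{1}{2} & 1 \end{bmatrix}$$

$-5$ times the first row of the submatrix was added to the second row of the submatrix to introduce a zero below the leading 1.

$$\begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & 1 & 0 & -\frac{7}{2} & -6 \\ 0 & 0 & 0 & 0 & \frac{1}{2} & 1 \end{bmatrix}$$

The top row in the submatrix was covered, and we returned again to Step 1. Leftmost nonzero column in the new submatrix (the fifth column).

$$\begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & 1 & 0 & -\frac{7}{2} & -6 \\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix}$$

The first (and only) row in the new submatrix was multiplied by 2 to introduce a leading 1.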

The entire matrix is now in row-echelon form. To find the reduced row-echelon form we need the following additional step. 



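Step 6. Beginning with the last nonzero row and working upward, add suitable multiples of each row to the rows above to introduce zeros above the leading 1's.

$$\begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix}$$

$\frac{7}{2}$ times the third row of the preceding matrix was added to the second row.

$$\begin{bmatrix} 1 & 2 & -5 & 3 & 0 & 2 \\ 0 & 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix}$$

$-6$ times the third row was added to the first row.

$$\begin{bmatrix} 1 & 2 & 0 & 3 & 0 & 7 \\ 0 & 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix}$$

5 times the second row was added to the first row.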



The last matrix is in reduced row-echelon form. 

If we use only the first five steps, the above procedure produces a row-echelon form and is called Gaussian elimination. Carrying 
the procedure through to the sixth step and producing a matrix in reduced row-echelon form is called Gauss-Jordan elimination. 
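
The six steps can be sketched in code. The following is a minimal illustration, not an optimized routine: the function name `rref` and the use of Python's `fractions.Fraction` for exact arithmetic are choices made here for the sketch, not part of the text. The forward loop carries out Steps 1 through 5 (Gaussian elimination), and the backward loop carries out Step 6.

```python
from fractions import Fraction


def rref(rows):
    """Return the reduced row-echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    pivots = []   # (row, column) position of each leading 1
    top = 0       # first row of the submatrix still to be processed

    # Forward phase (Steps 1-5): produces a row-echelon form.
    for col in range(n_cols):
        if top == n_rows:
            break
        # Steps 1-2: look for a nonzero entry in this column at or below `top`.
        src = next((r for r in range(top, n_rows) if m[r][col] != 0), None)
        if src is None:
            continue                      # column is all zeros in the submatrix
        m[top], m[src] = m[src], m[top]   # row interchange (no-op if src == top)
        # Step 3: scale the row to introduce a leading 1.
        a = m[top][col]
        m[top] = [x / a for x in m[top]]
        # Step 4: clear the entries below the leading 1.
        for r in range(top + 1, n_rows):
            k = m[r][col]
            if k != 0:
                m[r] = [x - k * y for x, y in zip(m[r], m[top])]
        pivots.append((top, col))
        top += 1                          # Step 5: "cover" this row and repeat

    # Backward phase (Step 6): clear the entries above each leading 1,
    # starting with the last nonzero row and working upward.
    for row, col in reversed(pivots):
        for r in range(row):
            k = m[r][col]
            if k != 0:
                m[r] = [x - k * y for x, y in zip(m[r], m[row])]
    return m


# The matrix reduced step by step in the text.
example = [
    [0, 0, -2, 0, 7, 12],
    [2, 4, -10, 6, 12, 28],
    [2, 4, -5, 6, -5, -1],
]
result = rref(example)
```

Applied to the matrix used in the step-by-step illustration, `result` has rows (1, 2, 0, 3, 0, 7), (0, 0, 1, 0, 0, 1), and (0, 0, 0, 0, 1, 2), in agreement with the final matrix of Step 6. Stopping before the backward loop would give a row-echelon form instead.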



Remark It can be shown that every matrix has a unique reduced row-echelon form; that is, one will arrive at the same reduced 
row-echelon form for a given matrix no matter how the row operations are varied. (A proof of this result can be found in the article 
"The Reduced Row Echelon Form of a Matrix Is Unique: A Simple Proof," by Thomas Yuster, Mathematics Magazine, Vol. 57, No. 
2, 1984, pp. 93-94.) In contrast, a row-echelon form of a given matrix is not unique: different sequences of row operations can 
produce different row-echelon forms. 




Karl Friedrich Gauss 



Karl Friedrich Gauss (1777-1855) was a German mathematician and scientist. Sometimes called the "prince of 
mathematicians," Gauss ranks with Isaac Newton and Archimedes as one of the three greatest mathematicians who ever lived. In 
the entire history of mathematics there may never have been a child so precocious as Gauss — by his own account he worked out 
the rudiments of arithmetic before he could talk. One day, before he was even three years old, his genius became apparent to his 
parents in a very dramatic way. His father was preparing the weekly payroll for the laborers under his charge while the boy 
watched quietly from a corner. At the end of the long and tedious calculation, Gauss informed his father that there was an error 
in the result and stated the answer, which he had worked out in his head. To the astonishment of his parents, a check of the 
computations showed Gauss to be correct! 

In his doctoral dissertation Gauss gave the first complete proof of the fundamental theorem of algebra, which states that every 
polynomial equation has as many solutions as its degree. At age 19 he solved a problem that baffled Euclid, inscribing a regular 
polygon of seventeen sides in a circle using straightedge and compass; and in 1801, at age 24, he published his first masterpiece, 
Disquisitiones Arithmeticae, considered by many to be one of the most brilliant achievements in mathematics. In that paper 
Gauss systematized the study of number theory (properties of the integers) and formulated the basic concepts that form the 
foundation of the subject. Among his myriad achievements, Gauss discovered the Gaussian or "bell-shaped" curve that is 
fundamental in probability, gave the first geometric interpretation of complex numbers and established their fundamental role in 
mathematics, developed methods of characterizing surfaces intrinsically by means of the curves that they contain, developed the 
theory of conformal (angle-preserving) maps, and discovered non-Euclidean geometry 30 years before the ideas were published 
by others. In physics he made major contributions to the theory of lenses and capillary action, and with Wilhelm Weber he did 
fundamental work in electromagnetism. Gauss invented the heliotrope, bifilar magnetometer, and an electrotelegraph. 



Gauss, who was deeply religious and aristocratic in demeanor, mastered foreign languages with ease, read extensively, and 
enjoyed mineralogy and botany as hobbies. He disliked teaching and was usually cool and discouraging to other mathematicians, 
possibly because he had already anticipated their work. It has been said that if Gauss had published all of his discoveries, the 
current state of mathematics would be advanced by 50 years. He was without a doubt the greatest mathematician of the modern 
era. 




Wilhelm Jordan 



Wilhelm Jordan (1842-1899) was a German engineer who specialized in geodesy. His contribution to solving linear systems 
appeared in his popular book, Handbuch der Vermessungskunde (Handbook of Geodesy), in 1888. 



EXAMPLE 4 Gauss-Jordan Elimination 



Solve by Gauss-Jordan elimination.

$x_1 + 3x_2 - 2x_3 + 2x_5 = 0$
$2x_1 + 6x_2 - 5x_3 - 2x_4 + 4x_5 - 3x_6 = -1$
$5x_3 + 10x_4 + 15x_6 = 5$
$2x_1 + 6x_2 + 8x_4 + 4x_5 + 18x_6 = 6$



Solution 



The augmented matrix for the system is 



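$$\begin{bmatrix} 1 & 3 & -2 & 0 & 2 & 0 & 0 \\ 2 & 6 & -5 & -2 & 4 & -3 & -1 \\ 0 & 0 & 5 & 10 & 0 & 15 & 5 \\ 2 & 6 & 0 & 8 & 4 & 18 & 6 \end{bmatrix}$$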



Adding -2 times the first row to the second and fourth rows gives 

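$$\begin{bmatrix} 1 & 3 & -2 & 0 & 2 & 0 & 0 \\ 0 & 0 & -1 & -2 & 0 & -3 & -1 \\ 0 & 0 & 5 & 10 & 0 & 15 & 5 \\ 0 & 0 & 4 & 8 & 0 & 18 & 6 \end{bmatrix}$$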



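Multiplying the second row by $-1$ and then adding $-5$ times the new second row to the third row and $-4$ times the new second row to the fourth row gives

$$\begin{bmatrix} 1 & 3 & -2 & 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 2 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 6 & 2 \end{bmatrix}$$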



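Interchanging the third and fourth rows and then multiplying the third row of the resulting matrix by $\frac{1}{6}$ gives the row-echelon form

$$\begin{bmatrix} 1 & 3 & -2 & 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 2 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 & \frac{1}{3} \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$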





Adding -3 times the third row to the second row and then adding 2 times the second row of the resulting matrix to the first row 
yields the reduced row-echelon form 

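$$\begin{bmatrix} 1 & 3 & 0 & 4 & 2 & 0 & 0 \\ 0 & 0 & 1 & 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & \frac{1}{3} \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$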







The corresponding system of equations is 



$x_1 + 3x_2 + 4x_4 + 2x_5 = 0$
$x_3 + 2x_4 = 0$
$x_6 = \frac{1}{3}$

(We have discarded the last equation, $0x_1 + 0x_2 + 0x_3 + 0x_4 + 0x_5 + 0x_6 = 0$, since it will be satisfied automatically by the solutions of the remaining equations.) Solving for the leading variables, we obtain

$x_1 = -3x_2 - 4x_4 - 2x_5$
$x_3 = -2x_4$
$x_6 = \frac{1}{3}$

If we assign the free variables $x_2$, $x_4$, and $x_5$ arbitrary values $r$, $s$, and $t$, respectively, the general solution is given by the formulas

$x_1 = -3r - 4s - 2t, \quad x_2 = r, \quad x_3 = -2s, \quad x_4 = s, \quad x_5 = t, \quad x_6 = \frac{1}{3}$


Back-Substitution 

It is sometimes preferable to solve a system of linear equations by using Gaussian elimination to bring the augmented matrix into 
row-echelon form without continuing all the way to the reduced row-echelon form. When this is done, the corresponding system of 
equations can be solved by a technique called back-substitution. The next example illustrates the idea. 



EXAMPLE 5 Example 4 Solved by Back-Substitution 



From the computations in Example 4, a row-echelon form of the augmented matrix is 

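$$\begin{bmatrix} 1 & 3 & -2 & 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 2 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 & \frac{1}{3} \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

To solve the corresponding system of equations

$x_1 + 3x_2 - 2x_3 + 2x_5 = 0$
$x_3 + 2x_4 + 3x_6 = 1$
$x_6 = \frac{1}{3}$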



we proceed as follows: 

Step 1. Solve the equations for the leading variables. 



$x_1 = -3x_2 + 2x_3 - 2x_5$
$x_3 = 1 - 2x_4 - 3x_6$
$x_6 = \frac{1}{3}$

Step 2. Beginning with the bottom equation and working upward, successively substitute each equation into all the equations above 
it. 



Substituting $x_6 = \frac{1}{3}$ into the second equation yields

$x_1 = -3x_2 + 2x_3 - 2x_5$
$x_3 = -2x_4$
$x_6 = \frac{1}{3}$

Substituting $x_3 = -2x_4$ into the first equation yields

$x_1 = -3x_2 - 4x_4 - 2x_5$
$x_3 = -2x_4$
$x_6 = \frac{1}{3}$



Step 3. Assign arbitrary values to the free variables, if any. 

If we assign $x_2$, $x_4$, and $x_5$ the arbitrary values $r$, $s$, and $t$, respectively, the general solution is given by the formulas

$x_1 = -3r - 4s - 2t, \quad x_2 = r, \quad x_3 = -2s, \quad x_4 = s, \quad x_5 = t, \quad x_6 = \frac{1}{3}$

This agrees with the solution obtained in Example 4.



Remark The arbitrary values that are assigned to the free variables are often called parameters. Although we shall generally use 
the letters r,s,t, ... for the parameters, any letters that do not conflict with the variable names may be used. 



EXAMPLE 6 Gaussian Elimination 



Solve

$x + y + 2z = 9$
$2x + 4y - 3z = 1$
$3x + 6y - 5z = 0$

by Gaussian elimination and back-substitution.



Solution 

This is the system in Example 3 of Section 1.1. In that example we converted the augmented matrix

$$\begin{bmatrix} 1 & 1 & 2 & 9 \\ 2 & 4 & -3 & 1 \\ 3 & 6 & -5 & 0 \end{bmatrix}$$



to the row-echelon form 



$$\begin{bmatrix} 1 & 1 & 2 & 9 \\ 0 & 1 & -\frac{7}{2} & -\frac{17}{2} \\ 0 & 0 & 1 & 3 \end{bmatrix}$$



The system corresponding to this matrix is

$x + y + 2z = 9$
$y - \frac{7}{2}z = -\frac{17}{2}$
$z = 3$

Solving for the leading variables yields

$x = 9 - y - 2z$
$y = -\frac{17}{2} + \frac{7}{2}z$
$z = 3$



Substituting the bottom equation into those above yields 



$x = 3 - y$
$y = 2$
$z = 3$

and substituting the second equation into the top yields $x = 1$, $y = 2$, $z = 3$. This agrees with the result found by Gauss-Jordan elimination in Example 3 of Section 1.1.
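
Back-substitution in Example 6 can be sketched as a short routine. This is an illustration only, under the assumption that the augmented matrix is already in row-echelon form with a leading 1 in every diagonal position, so that the solution is unique and there are no free variables. The function name `back_substitute` is hypothetical.

```python
from fractions import Fraction


def back_substitute(aug):
    """Solve an n x (n+1) augmented matrix in row-echelon form.

    Assumes a leading 1 in each diagonal position (unique solution).
    """
    n = len(aug)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):           # start with the bottom equation
        # Contribution of the variables that have already been determined.
        known = sum(Fraction(aug[i][j]) * x[j] for j in range(i + 1, n))
        x[i] = Fraction(aug[i][n]) - known   # the leading coefficient is 1
    return x


# Row-echelon form from Example 6.
row_echelon = [
    [1, 1, 2, 9],
    [0, 1, Fraction(-7, 2), Fraction(-17, 2)],
    [0, 0, 1, 3],
]
solution = back_substitute(row_echelon)   # values of x, y, z in that order
```

For the row-echelon form of Example 6 the routine returns $x = 1$, $y = 2$, $z = 3$. Systems with free variables, such as the one in Example 5, need the parameter bookkeeping described in the text and are not handled by this sketch.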

Homogeneous Linear Systems 

A system of linear equations is said to be homogeneous if the constant terms are all zero; that is, the system has the form

$a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = 0$
$a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = 0$
$\vdots$
$a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = 0$

Every homogeneous system of linear equations is consistent, since all such systems have $x_1 = 0, x_2 = 0, \ldots, x_n = 0$ as a solution. This solution is called the trivial solution; if there are other solutions, they are called nontrivial solutions.

Because a homogeneous linear system always has the trivial solution, there are only two possibilities for its solutions: 
The system has only the trivial solution. 

The system has infinitely many solutions in addition to the trivial solution. 

In the special case of a homogeneous linear system of two equations in two unknowns, say 

$a_1x + b_1y = 0$ ($a_1$, $b_1$ not both zero)
$a_2x + b_2y = 0$ ($a_2$, $b_2$ not both zero)

the graphs of the equations are lines through the origin, and the trivial solution corresponds to the point of intersection at the origin 
(Figure 1.2.1). 



[Figure 1.2.1: the lines $a_1x + b_1y = 0$ and $a_2x + b_2y = 0$. (a) Only the trivial solution. (b) Infinitely many solutions.]



There is one case in which a homogeneous system is assured of having nontrivial solutions — namely, whenever the system involves 
more unknowns than equations. To see why, consider the following example of four equations in five unknowns. 



EXAMPLE 7 Gauss-Jordan Elimination 



Solve the following homogeneous system of linear equations by using Gauss-Jordan elimination. 



$2x_1 + 2x_2 - x_3 + x_5 = 0$
$-x_1 - x_2 + 2x_3 - 3x_4 + x_5 = 0$
$x_1 + x_2 - 2x_3 - x_5 = 0$
$x_3 + x_4 + x_5 = 0$ (1)



Solution 



The augmented matrix for the system is 



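$$\begin{bmatrix} 2 & 2 & -1 & 0 & 1 & 0 \\ -1 & -1 & 2 & -3 & 1 & 0 \\ 1 & 1 & -2 & 0 & -1 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 \end{bmatrix}$$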



Reducing this matrix to reduced row-echelon form, we obtain 

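$$\begin{bmatrix} 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$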

The corresponding system of equations is 



$x_1 + x_2 + x_5 = 0$
$x_3 + x_5 = 0$
$x_4 = 0$ (2)

Solving for the leading variables yields

$x_1 = -x_2 - x_5$
$x_3 = -x_5$
$x_4 = 0$

Thus, the general solution is

$x_1 = -s - t, \quad x_2 = s, \quad x_3 = -t, \quad x_4 = 0, \quad x_5 = t$
Note that the trivial solution is obtained when s = t = 0. 
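
A general solution can be checked by substituting it back into the original system. The sketch below does this for Example 7 over a small grid of integer parameter values; the helper names `general_solution` and `residuals` are chosen here for illustration. A finite check of this kind is a sanity test, not a proof, although in this case each left-hand side also simplifies to zero symbolically.

```python
import itertools

# Coefficient matrix of system (1) in Example 7; every right-hand side is 0.
A = [
    [2, 2, -1, 0, 1],
    [-1, -1, 2, -3, 1],
    [1, 1, -2, 0, -1],
    [0, 0, 1, 1, 1],
]


def general_solution(s, t):
    """Return (x1, x2, x3, x4, x5) for the parameter values s and t."""
    return [-s - t, s, -t, 0, t]


def residuals(x):
    """Left-hand side of each equation evaluated at x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# Try every pair (s, t) with -3 <= s, t <= 3.
all_zero = all(
    residuals(general_solution(s, t)) == [0, 0, 0, 0]
    for s, t in itertools.product(range(-3, 4), repeat=2)
)
```

Here `all_zero` is `True`, and the choice $s = t = 0$ reproduces the trivial solution.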

Example 7 illustrates two important points about solving homogeneous systems of linear equations. First, none of the three 
elementary row operations alters the final column of zeros in the augmented matrix, so the system of equations corresponding to the 
reduced row-echelon form of the augmented matrix must also be a homogeneous system [see system 2]. Second, depending on 
whether the reduced row-echelon form of the augmented matrix has any zero rows, the number of equations in the reduced system 
is the same as or less than the number of equations in the original system [compare systems 1 and 2]. Thus, if the given 
homogeneous system has $m$ equations in $n$ unknowns with $m < n$, and if there are $r$ nonzero rows in the reduced row-echelon form of the augmented matrix, we will have $r < n$. It follows that the system of equations corresponding to the reduced row-echelon form of the augmented matrix will have the form

$x_{k_1} + \Sigma(\ ) = 0$
$x_{k_2} + \Sigma(\ ) = 0$
$\vdots$
$x_{k_r} + \Sigma(\ ) = 0$ (3)

where $x_{k_1}, x_{k_2}, \ldots, x_{k_r}$ are the leading variables and $\Sigma(\ )$ denotes sums (possibly all different) that involve the $n - r$ free variables [compare system 3 with system 2 above]. Solving for the leading variables gives

$x_{k_1} = -\Sigma(\ )$
$x_{k_2} = -\Sigma(\ )$
$\vdots$
$x_{k_r} = -\Sigma(\ )$

As in Example 7, we can assign arbitrary values to the free variables on the right-hand side and thus obtain infinitely many solutions 
to the system. 

In summary, we have the following important theorem. 
THEOREM 1.2.1 



A homogeneous system of linear equations with more unknowns than equations has infinitely many solutions. 



Remark Note that Theorem 1.2.1 applies only to homogeneous systems. A nonhomogeneous system with more unknowns than 
equations need not be consistent (Exercise 28); however, if the system is consistent, it will have infinitely many solutions. This will 
be proved later. 

Computer Solution of Linear Systems 

In applications it is not uncommon to encounter large linear systems that must be solved by computer. Most computer algorithms 
for solving such systems are based on Gaussian elimination or Gauss-Jordan elimination, but the basic procedures are often 
modified to deal with such issues as 



Reducing roundoff errors 



Minimizing the use of computer memory space 



Solving the system with maximum speed 



Some of these matters will be considered in Chapter 9. For hand computations, fractions are an annoyance that often cannot be 
avoided. However, in some cases it is possible to avoid them by varying the elementary row operations in the right way. Thus, once 
the methods of Gaussian elimination and Gauss-Jordan elimination have been mastered, the reader may wish to vary the steps in 
specific problems to avoid fractions (see Exercise 18). 

Remark Since Gauss-Jordan elimination avoids the use of back-substitution, it would seem that this method would be the more 
efficient of the two methods we have considered. 

It can be argued that this statement is true for solving small systems by hand since Gauss-Jordan elimination actually involves less 
writing. However, for large systems of equations, it has been shown that the Gauss-Jordan elimination method requires about 50% 
more operations than Gaussian elimination. This is an important consideration when one is working on computers. 



Exercise Set 1.2 






1. Which of the following $3 \times 3$ matrices are in reduced row-echelon form?



(a) 



"1 





0" 





1 











1 



(b) 



"1 





0" 





1 















(c) 



"0 


1 


0" 








1 












(d) 



"1 





0" 








1 












(e) 



"1 





0" 

















1 



(f) 



"0 


1 


0" 


1 


















(g) 



"1 


1 


0" 





1 















(h) 



"1 





2 





1 


3 












(i) 



"0 





r 





















(j) 



"0 





0" 





















2. Which of the following $3 \times 3$ matrices are in row-echelon form?



(a) 



"1 





0" 





1 











1 



(b) 



"1 


2 


0" 





1 















(c) 



"1 





0" 





1 








2 






(d) 



"1 


3 


4" 








1 












(e) 



1 5 


-3" 


1 


1 









(f) 



"1 


2 


3~ 

















1 



3. In each part determine whether the matrix is in row-echelon form, reduced row-echelon form, both, or neither.



(a) 



1 


2 





3 


o~ 








1 


1 

















1 


















(b) 



1 








5~ 








1 


3 





1 





4 



(c) 



10 3 1 
12 4 



(d) 



1-755 
13 2 



(e) 



1 


3 





2 


o" 


1 





2 


2 

















1 


















(f) 



"0 


0" 















4. In each part suppose that the augmented matrix for a system of linear equations has been reduced by row operations to the given reduced row-echelon form. Solve the system.



(a) 



(b) 



1 
1 
1 


-3 

7 




1 
1 
1 


-7 
3 
1 


8 

2 

-5 



(c) 



(d) 



1 


-6 








3 


-2 








1 


4 


7 











1 5 


8 

















1 


-3 





0" 












1 


















1 







5. In each part suppose that the augmented matrix for a system of linear equations has been reduced by row operations to the given row-echelon form. Solve the system.



(a) 



1 


-3 4 7" 





1 2 2 





1 5 



(b) 



(c) 



10 8-5 
14-9 
1 1 


6 
3 
2 




17-20 
11 
1 



— 


8 
6 
3 




-3 
5 
9 




(d) 



1 


-3 7 f 





1 4 





1 



6. Solve each of the following systems by Gauss-Jordan elimination.

(a) $x_1 + x_2 + 2x_3 = 8$
$-x_1 - 2x_2 + 3x_3 = 1$
$3x_1 - 7x_2 + 4x_3 = 10$

(b) $2x_1 + 2x_2 + 2x_3 = 0$
$-2x_1 + 5x_2 + 2x_3 = 1$
$8x_1 + x_2 + 4x_3 = -1$

(c) $x - y + 2z - w = -1$
$2x + y - 2z - 2w = -2$
$-x + 2y - 4z + w = 1$
$3x - 3w = -3$

(d) $-2b + 3c = 1$
$3a + 6b - 3c = -2$
$6a + 6b + 3c = 5$



7. Solve each of the systems in Exercise 6 by Gaussian elimination.



8. Solve each of the following systems by Gauss-Jordan elimination.

(a) $2x_1 - 3x_2 = -2$
$2x_1 + x_2 = 1$
$3x_1 + 2x_2 = 1$

(b) $3x_1 + 2x_2 - x_3 = -15$
$5x_1 + 3x_2 + 2x_3 = 0$
$3x_1 + x_2 + 3x_3 = 11$
$-6x_1 - 4x_2 + 2x_3 = 30$

(c) $4x_1 - 8x_2 = 12$
$3x_1 - 6x_2 = 9$
$-2x_1 + 4x_2 = -6$

(d) $10y - 4z + w = 1$
$x + 4y - z + w = 2$
$3x + 2y + z + 2w = 5$
$-2x - 8y + 2z - 2w = -4$
$x - 6y + 3z = 1$



9. Solve each of the systems in Exercise 8 by Gaussian elimination.



10. Solve each of the following systems by Gauss-Jordan elimination.

(a) $5x_1 - 2x_2 + 6x_3 = 0$
$-2x_1 + x_2 + 3x_3 = 1$

(b) $x_1 - 2x_2 + x_3 - 4x_4 = 1$
$x_1 + 3x_2 + 7x_3 + 2x_4 = 2$
$x_1 - 12x_2 - 11x_3 - 16x_4 = 5$

(c) $w + 2x - y = 4$
$x - y = 3$
$w + 3x - 2y = 7$
$2u + 4v + w + 7x = 7$



11. Solve each of the systems in Exercise 10 by Gaussian elimination.



12. Without using pencil and paper, determine which of the following homogeneous systems have nontrivial solutions.

(a) $2x_1 - 3x_2 + 4x_3 - x_4 = 0$
$7x_1 + x_2 - 8x_3 + 9x_4 = 0$
$2x_1 + 8x_2 + x_3 - x_4 = 0$

(b) $x_1 + 3x_2 - x_3 = 0$
$x_2 - 8x_3 = 0$
$4x_3 = 0$

(c) $a_{11}x_1 + a_{12}x_2 + a_{13}x_3 = 0$
$a_{21}x_1 + a_{22}x_2 + a_{23}x_3 = 0$

(d) $3x_1 - 2x_2 = 0$
$6x_1 - 4x_2 = 0$



13. Solve the following homogeneous systems of linear equations by any method.

(a) $2x_1 + x_2 + 3x_3 = 0$
$x_1 + 2x_2 = 0$
$x_2 + x_3 = 0$

(b) $3x_1 + x_2 + x_3 + x_4 = 0$
$5x_1 - x_2 + x_3 - x_4 = 0$

(c) $2x + 2y + 4z = 0$
$w - y - 3z = 0$
$2w + 3x + y + z = 0$
$-2w + x + 3y - 2z = 0$



14. Solve the following homogeneous systems of linear equations by any method.

(a) $2x - y - 3z = 0$
$-x + 2y - 3z = 0$
$x + y + 4z = 0$

(b) $v + 3w - 2x = 0$
$2u + v - 4w + 3x = 0$
$2u + 3v + 2w - x = 0$
$-4u - 3v + 5w - 4x = 0$

(c) $x_1 + 3x_2 + x_4 = 0$
$x_1 + 4x_2 + 2x_3 = 0$
$-2x_2 - 2x_3 - x_4 = 0$
$2x_1 - 4x_2 + x_3 + x_4 = 0$
$x_1 - 2x_2 - x_3 + x_4 = 0$



15. Solve the following systems by any method.

(a) $2I_1 - I_2 + 3I_3 + 4I_4 = 9$
$I_1 - 2I_3 + 7I_4 = 11$
$3I_1 - 3I_2 + I_3 + 5I_4 = 8$
$2I_1 + I_2 + 4I_3 + 4I_4 = 10$

(b) $Z_3 + Z_4 + Z_5 = 0$
$-Z_1 - Z_2 + 2Z_3 - 3Z_4 + Z_5 = 0$
$Z_1 + Z_2 - 2Z_3 - Z_5 = 0$
$2Z_1 + 2Z_2 - Z_3 + Z_5 = 0$



16. Solve the following systems, where $a$, $b$, and $c$ are constants.

(a) $2x + y = a$
$3x + 6y = b$

(b) $x_1 + x_2 + x_3 = a$
$2x_1 + 2x_3 = b$
$3x_2 + 3x_3 = c$



17. For which values of $a$ will the following system have no solutions? Exactly one solution? Infinitely many solutions?

$x + 2y - 3z = 4$
$3x - y + 5z = 2$
$4x + y + (a^2 - 14)z = a + 2$



18. Reduce

$$\begin{bmatrix} 2 & 1 & 3 \\ 0 & -2 & -29 \\ 3 & 4 & 5 \end{bmatrix}$$

to reduced row-echelon form.



19. Find two different row-echelon forms of

$$\begin{bmatrix} 1 & 3 \\ 2 & 7 \end{bmatrix}$$



20. Solve the following system of nonlinear equations for the unknown angles $\alpha$, $\beta$, and $\gamma$, where $0 \le \alpha \le 2\pi$, $0 \le \beta \le 2\pi$, and $0 \le \gamma < \pi$.

$2\sin\alpha - \cos\beta + 3\tan\gamma = 3$
$4\sin\alpha + 2\cos\beta - 2\tan\gamma = 2$
$6\sin\alpha - 3\cos\beta + \tan\gamma = 9$



21. Show that the following nonlinear system has 18 solutions if $0 \le \alpha \le 2\pi$, $0 \le \beta \le 2\pi$, and $0 \le \gamma \le 2\pi$.

$\sin\alpha + 2\cos\beta + 3\tan\gamma = 0$
$2\sin\alpha + 5\cos\beta + 3\tan\gamma = 0$
$-\sin\alpha - 5\cos\beta + 5\tan\gamma = 0$



22. For which value(s) of $\lambda$ does the system of equations

$(\lambda - 3)x + y = 0$
$x + (\lambda - 3)y = 0$

have nontrivial solutions?



23. Solve the system

$2x_1 - x_2 = \lambda x_1$
$2x_1 - x_2 + x_3 = \lambda x_2$
$-2x_1 + 2x_2 + x_3 = \lambda x_3$

for $x_1$, $x_2$, and $x_3$ in the two cases $\lambda = 1$, $\lambda = 2$.



24. 



Solve the following system for x, y, and z. 



1+ 2_4 =1 

* 7 z 

7 + f + ! = ° 

* y z 



25. Find the coefficients $a$, $b$, $c$, and $d$ so that the curve shown in the accompanying figure is the graph of the equation $y = ax^3 + bx^2 + cx + d$.

26. Find coefficients $a$, $b$, $c$, and $d$ so that the curve shown in the accompanying figure is given by the equation $ax^2 + ay^2 + bx + cy + d = 0$.




Figure Ex-25 



Figure Ex-26 



(4.-14) 




H.-3> 



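27. (a) Show that if $ad - bc \ne 0$, then the reduced row-echelon form of

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad \text{is} \quad \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

(b) Use part (a) to show that the system

$ax + by = k$
$cx + dy = l$

has exactly one solution when $ad - bc \ne 0$.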



28. Find an inconsistent linear system that has more unknowns than equations.



Discussion 
Discovery 



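29. Indicate all possible reduced row-echelon forms of

(a) $$\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}$$

(b) $$\begin{bmatrix} a & b & c & d \\ e & f & g & h \\ i & j & k & l \\ m & n & p & q \end{bmatrix}$$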



30. Consider the system of equations

$ax + by = 0$
$cx + dy = 0$
$ex + fy = 0$

Discuss the relative positions of the lines $ax + by = 0$, $cx + dy = 0$, and $ex + fy = 0$ when (a) the system has only the trivial solution, and (b) the system has nontrivial solutions.

31. Indicate whether the statement is always true or sometimes false. Justify your answer by giving a logical argument or a counterexample.



(a) If a matrix is reduced to reduced row-echelon form by two different sequences of elementary 
row operations, the resulting matrices will be different. 



(b) If a matrix is reduced to row-echelon form by two different sequences of elementary row 
operations, the resulting matrices might be different. 



(c) If the reduced row-echelon form of the augmented matrix for a linear system has a row of 
zeros, then the system must have infinitely many solutions. 



(d) If three lines in the $xy$-plane are sides of a triangle, then the system of equations formed from
their equations has three solutions, one corresponding to each vertex.



32. Indicate whether the statement is always true or sometimes false. Justify your answer by giving a logical argument or a counterexample.



(a) A linear system of three equations in five unknowns must be consistent. 

(b) A linear system of five equations in three unknowns cannot be consistent. 



(c) If a linear system of $n$ equations in $n$ unknowns has $n$ leading 1's in the reduced row-echelon
form of its augmented matrix, then the system has exactly one solution.



(d) If a linear system of n equations in n unknowns has two equations that are multiples of one 
another, then the system is inconsistent. 
1.3

MATRICES AND MATRIX 
OPERATIONS 



Rectangular arrays of real numbers arise in many contexts other than as 
augmented matrices for systems of linear equations. In this section we begin 
our study of matrix theory by giving some of the fundamental definitions of 
the subject. We shall see how matrices can be combined through the 
arithmetic operations of addition, subtraction, and multiplication. 



Matrix Notation and Terminology 

In Section 1.2 we used rectangular arrays of numbers, called augmented matrices, to abbreviate systems of linear equations. 
However, rectangular arrays of numbers occur in other contexts as well. For example, the following rectangular array with 
three rows and seven columns might describe the number of hours that a student spent studying three subjects during a 
certain week: 





            Mon.   Tues.   Wed.   Thurs.   Fri.   Sat.   Sun.
Math         2      3       2      4        1      4      2
History      0      3       1      4        3      2      2
Language     4      1       3      1        0      0      2



If we suppress the headings, then we are left with the following rectangular array of numbers with three rows and seven columns, called a "matrix":

$$\begin{bmatrix} 2 & 3 & 2 & 4 & 1 & 4 & 2 \\ 0 & 3 & 1 & 4 & 3 & 2 & 2 \\ 4 & 1 & 3 & 1 & 0 & 0 & 2 \end{bmatrix}$$

More generally, we make the following definition. 



DEFINITION 



A matrix is a rectangular array of numbers. The numbers in the array are called the entries in the matrix. 



EXAMPLE 1 Examples of Matrices 



Some examples of matrices are 



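$$\begin{bmatrix} 1 & 2 \\ 3 & 0 \\ -1 & 4 \end{bmatrix}, \quad \begin{bmatrix} 2 & 1 & 0 & -3 \end{bmatrix}, \quad \begin{bmatrix} e & \pi & -\sqrt{2} \\ 0 & \frac{1}{2} & 1 \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 3 \end{bmatrix}, \quad [4]$$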



The size of a matrix is described in terms of the number of rows (horizontal lines) and columns (vertical lines) it contains. For example, the first matrix in Example 1 has three rows and two columns, so its size is 3 by 2 (written $3 \times 2$). In a size description, the first number always denotes the number of rows, and the second denotes the number of columns. The remaining matrices in Example 1 have sizes $1 \times 4$, $3 \times 3$, $2 \times 1$, and $1 \times 1$, respectively. A matrix with only one column is called a column matrix (or a column vector), and a matrix with only one row is called a row matrix (or a row vector). Thus, in Example 1 the $2 \times 1$ matrix is a column matrix, the $1 \times 4$ matrix is a row matrix, and the $1 \times 1$ matrix is both a row matrix and a column matrix. (The term vector has another meaning that we will discuss in subsequent chapters.)

Remark It is common practice to omit the brackets from a $1 \times 1$ matrix. Thus we might write 4 rather than [4]. Although this makes it impossible to tell whether 4 denotes the number "four" or the $1 \times 1$ matrix whose entry is "four," this rarely causes problems, since it is usually possible to tell which is meant from the context in which the symbol appears.



We shall use capital letters to denote matrices and lowercase letters to denote numerical quantities; thus we might write 



$$A = \begin{bmatrix} 2 & 1 & 7 \\ 3 & 4 & 2 \end{bmatrix} \quad \text{or} \quad C = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix}$$

When discussing matrices, it is common to refer to numerical quantities as scalars. Unless stated otherwise, scalars will be real numbers; complex scalars will be considered in Chapter 10.

The entry that occurs in row $i$ and column $j$ of a matrix $A$ will be denoted by $a_{ij}$. Thus a general $3 \times 4$ matrix might be written as

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix}$$

and a general $m \times n$ matrix as

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \qquad (1)$$

When compactness of notation is desired, the preceding matrix can be written as

$$[a_{ij}]_{m \times n} \quad \text{or} \quad [a_{ij}]$$

the first notation being used when it is important in the discussion to know the size, and the second being used when the size need not be emphasized. Usually, we shall match the letter denoting a matrix with the letter denoting its entries; thus, for a matrix $B$ we would generally use $b_{ij}$ for the entry in row $i$ and column $j$, and for a matrix $C$ we would use the notation $c_{ij}$.

The entry in row $i$ and column $j$ of a matrix $A$ is also commonly denoted by the symbol $(A)_{ij}$. Thus, for matrix (1) above, we have

$$(A)_{ij} = a_{ij}$$

and for the matrix

$$A = \begin{bmatrix} 2 & -3 \\ 7 & 0 \end{bmatrix}$$

we have $(A)_{11} = 2$, $(A)_{12} = -3$, $(A)_{21} = 7$, and $(A)_{22} = 0$.



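Row and column matrices are of special importance, and it is common practice to denote them by boldface lowercase letters rather than capital letters. For such matrices, double subscripting of the entries is unnecessary. Thus a general $1 \times n$ row matrix $\mathbf{a}$ and a general $m \times 1$ column matrix $\mathbf{b}$ would be written as

$$\mathbf{a} = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} \quad \text{and} \quad \mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$$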


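A matrix $A$ with $n$ rows and $n$ columns is called a square matrix of order $n$, and the entries $a_{11}, a_{22}, \ldots, a_{nn}$ in (2) are said to be on the main diagonal of $A$.

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \qquad (2)$$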



Operations on Matrices 

So far, we have used matrices to abbreviate the work in solving systems of linear equations. For other applications, however, 
it is desirable to develop an "arithmetic of matrices" in which matrices can be added, subtracted, and multiplied in a useful 
way. The remainder of this section will be devoted to developing this arithmetic. 



DEFINITION 



Two matrices are defined to be equal if they have the same size and their corresponding entries are equal. 



In matrix notation, if $A = [a_{ij}]$ and $B = [b_{ij}]$ have the same size, then $A = B$ if and only if $(A)_{ij} = (B)_{ij}$, or, equivalently, $a_{ij} = b_{ij}$ for all $i$ and $j$.



EXAMPLE 2 Equality of Matrices 



Consider the matrices 



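$$A = \begin{bmatrix} 2 & 1 \\ 3 & x \end{bmatrix}, \quad B = \begin{bmatrix} 2 & 1 \\ 3 & 5 \end{bmatrix}, \quad C = \begin{bmatrix} 2 & 1 & 0 \\ 3 & 4 & 0 \end{bmatrix}$$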



If x = 5, then A = B, but for all other values of x the matrices A and B are not equal, since not all of their corresponding 
entries are equal. There is no value of x for which A = C since A and C have different sizes. 



DEFINITION 



If $A$ and $B$ are matrices of the same size, then the sum $A + B$ is the matrix obtained by adding the entries of $B$ to the corresponding entries of $A$, and the difference $A - B$ is the matrix obtained by subtracting the entries of $B$ from the corresponding entries of $A$. Matrices of different sizes cannot be added or subtracted.

In matrix notation, if $A = [a_{ij}]$ and $B = [b_{ij}]$ have the same size, then

$$(A + B)_{ij} = (A)_{ij} + (B)_{ij} = a_{ij} + b_{ij} \quad \text{and} \quad (A - B)_{ij} = (A)_{ij} - (B)_{ij} = a_{ij} - b_{ij}$$



EXAMPLE 3 Addition and Subtraction 



Consider the matrices 



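$$A = \begin{bmatrix} 2 & 1 & 0 & 3 \\ -1 & 0 & 2 & 4 \\ 4 & -2 & 7 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} -4 & 3 & 5 & 1 \\ 2 & 2 & 0 & -1 \\ 3 & 2 & -4 & 5 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 1 \\ 2 & 2 \end{bmatrix}$$

Then

$$A + B = \begin{bmatrix} -2 & 4 & 5 & 4 \\ 1 & 2 & 2 & 3 \\ 7 & 0 & 3 & 5 \end{bmatrix} \quad \text{and} \quad A - B = \begin{bmatrix} 6 & -2 & -5 & 2 \\ -3 & -2 & 2 & 5 \\ 1 & -4 & 11 & -5 \end{bmatrix}$$

The expressions $A + C$, $B + C$, $A - C$, and $B - C$ are undefined.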
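
The entrywise definitions of sum and difference translate directly into code. The sketch below represents a matrix as a list of rows; the function names `shape`, `add`, and `subtract` are illustrative choices, and raising `ValueError` is simply one way to signal that an expression is undefined when the sizes differ.

```python
def shape(M):
    """Return (number of rows, number of columns)."""
    return (len(M), len(M[0]))


def add(A, B):
    """Entrywise sum A + B; defined only for matrices of the same size."""
    if shape(A) != shape(B):
        raise ValueError("A + B is undefined for matrices of different sizes")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def subtract(A, B):
    """Entrywise difference A - B; defined only for matrices of the same size."""
    if shape(A) != shape(B):
        raise ValueError("A - B is undefined for matrices of different sizes")
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


# The matrices of Example 3.
A = [[2, 1, 0, 3], [-1, 0, 2, 4], [4, -2, 7, 0]]
B = [[-4, 3, 5, 1], [2, 2, 0, -1], [3, 2, -4, 5]]
C = [[1, 1], [2, 2]]

total = add(A, B)
difference = subtract(A, B)
```

With the matrices of Example 3, `total` and `difference` reproduce $A + B$ and $A - B$ as computed in the example, while a call such as `add(A, C)` raises `ValueError` because the sizes $3 \times 4$ and $2 \times 2$ differ.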



DEFINITION 



If $A$ is any matrix and $c$ is any scalar, then the product $cA$ is the matrix obtained by multiplying each entry of the matrix $A$ by $c$. The matrix $cA$ is said to be a scalar multiple of $A$.

In matrix notation, if $A = [a_{ij}]$, then

$$(cA)_{ij} = c(A)_{ij} = ca_{ij}$$



EXAMPLE 4 Scalar Multiples 



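For the matrices

$$A = \begin{bmatrix} 2 & 3 & 4 \\ 1 & 3 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 0 & 2 & 7 \\ -1 & 3 & -5 \end{bmatrix}, \quad C = \begin{bmatrix} 9 & -6 & 3 \\ 3 & 0 & 12 \end{bmatrix}$$

we have

$$2A = \begin{bmatrix} 4 & 6 & 8 \\ 2 & 6 & 2 \end{bmatrix}, \quad (-1)B = \begin{bmatrix} 0 & -2 & -7 \\ 1 & -3 & 5 \end{bmatrix}, \quad \tfrac{1}{3}C = \begin{bmatrix} 3 & -2 & 1 \\ 1 & 0 & 4 \end{bmatrix}$$

It is common practice to denote $(-1)B$ by $-B$.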



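If $A_1, A_2, \ldots, A_n$ are matrices of the same size and $c_1, c_2, \ldots, c_n$ are scalars, then an expression of the form

$$c_1A_1 + c_2A_2 + \cdots + c_nA_n$$

is called a linear combination of $A_1, A_2, \ldots, A_n$ with coefficients $c_1, c_2, \ldots, c_n$. For example, if $A$, $B$, and $C$ are the matrices in Example 4, then

$$2A - B + \tfrac{1}{3}C = 2A + (-1)B + \tfrac{1}{3}C = \begin{bmatrix} 4 & 6 & 8 \\ 2 & 6 & 2 \end{bmatrix} + \begin{bmatrix} 0 & -2 & -7 \\ 1 & -3 & 5 \end{bmatrix} + \begin{bmatrix} 3 & -2 & 1 \\ 1 & 0 & 4 \end{bmatrix} = \begin{bmatrix} 7 & 2 & 2 \\ 4 & 3 & 11 \end{bmatrix}$$

is the linear combination of $A$, $B$, and $C$ with scalar coefficients 2, $-1$, and $\frac{1}{3}$.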
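
Scalar multiples and linear combinations can be computed in the same entrywise way. In the sketch below the function names `scalar_multiple` and `linear_combination` are illustrative, and `fractions.Fraction` is used so that the coefficient $\frac{1}{3}$ is handled exactly rather than as a rounded decimal.

```python
from fractions import Fraction


def scalar_multiple(c, A):
    """Return cA: every entry of A multiplied by the scalar c."""
    return [[c * a for a in row] for row in A]


def linear_combination(coeffs, matrices):
    """Return c1*A1 + c2*A2 + ... for matrices of one common size."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    total = [[0] * cols for _ in range(rows)]
    for c, M in zip(coeffs, matrices):
        if (len(M), len(M[0])) != (rows, cols):
            raise ValueError("all matrices must have the same size")
        for i in range(rows):
            for j in range(cols):
                total[i][j] += c * M[i][j]
    return total


# The matrices of Example 4.
A = [[2, 3, 4], [1, 3, 1]]
B = [[0, 2, 7], [-1, 3, -5]]
C = [[9, -6, 3], [3, 0, 12]]

twice_A = scalar_multiple(2, A)
combo = linear_combination([2, -1, Fraction(1, 3)], [A, B, C])
```

Here `twice_A` matches $2A$ from Example 4, and `combo` matches the linear combination $2A - B + \frac{1}{3}C$ worked out above.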



Thus far we have defined multiplication of a matrix by a scalar but not the multiplication of two matrices. Since matrices are 
added by adding corresponding entries and subtracted by subtracting corresponding entries, it would seem natural to define 
multiplication of matrices by multiplying corresponding entries. However, it turns out that such a definition would not be 
very useful for most problems. Experience has led mathematicians to the following more useful definition of matrix 
multiplication. 



DEFINITION 



If A is an m x r matrix and B is an r x n matrix, then the product AB is the m x n matrix whose entries are determined as
follows. To find the entry in row i and column j of AB, single out row i from the matrix A and column j from the matrix B.
Multiply the corresponding entries from the row and column together, and then add up the resulting products.
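The definition translates almost word for word into code. The following is a minimal sketch under the assumption that a matrix is stored as a list of rows; the name `matmul` is chosen here for illustration only. It is applied to the two matrices used in Example 5.

```python
def matmul(A, B):
    """Product AB by the row-by-column rule; A is m x r, B is r x n."""
    m, r = len(A), len(A[0])
    if len(B) != r:
        raise ValueError("columns of A must equal rows of B")
    n = len(B[0])
    # Entry (i, j): multiply row i of A by column j of B entrywise and add.
    return [[sum(A[i][k] * B[k][j] for k in range(r)) for j in range(n)]
            for i in range(m)]

A = [[1, 2, 4], [2, 6, 0]]                          # 2 x 3
B = [[4, 1, 4, 3], [0, -1, 3, 1], [2, 7, 5, 2]]     # 3 x 4
AB = matmul(A, B)                                   # 2 x 4
print(AB)
```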



EXAMPLE 5 Multiplying Matrices 



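Consider the matrices

$$A = \begin{bmatrix} 1 & 2 & 4 \\ 2 & 6 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 4 & 1 & 4 & 3 \\ 0 & -1 & 3 & 1 \\ 2 & 7 & 5 & 2 \end{bmatrix}$$

Since A is a 2 x 3 matrix and B is a 3 x 4 matrix, the product AB is a 2 x 4 matrix. To determine, for example, the entry in
row 2 and column 3 of AB, we single out row 2 from A and column 3 from B. Then, as illustrated below, we multiply
corresponding entries together and add up these products.

$$\begin{bmatrix} 1 & 2 & 4 \\ 2 & 6 & 0 \end{bmatrix} \begin{bmatrix} 4 & 1 & 4 & 3 \\ 0 & -1 & 3 & 1 \\ 2 & 7 & 5 & 2 \end{bmatrix} = \begin{bmatrix} \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & 26 & \cdot \end{bmatrix}$$

$$(2 \cdot 4) + (6 \cdot 3) + (0 \cdot 5) = 26$$

The entry in row 1 and column 4 of AB is computed as follows:

$$\begin{bmatrix} 1 & 2 & 4 \\ 2 & 6 & 0 \end{bmatrix} \begin{bmatrix} 4 & 1 & 4 & 3 \\ 0 & -1 & 3 & 1 \\ 2 & 7 & 5 & 2 \end{bmatrix} = \begin{bmatrix} \cdot & \cdot & \cdot & 13 \\ \cdot & \cdot & \cdot & \cdot \end{bmatrix}$$

$$(1 \cdot 3) + (2 \cdot 1) + (4 \cdot 2) = 13$$

The computations for the remaining entries are

$$\begin{aligned}
(1 \cdot 4) + (2 \cdot 0) + (4 \cdot 2) &= 12 \\
(1 \cdot 1) - (2 \cdot 1) + (4 \cdot 7) &= 27 \\
(1 \cdot 4) + (2 \cdot 3) + (4 \cdot 5) &= 30 \\
(2 \cdot 4) + (6 \cdot 0) + (0 \cdot 2) &= 8 \\
(2 \cdot 1) - (6 \cdot 1) + (0 \cdot 7) &= -4 \\
(2 \cdot 3) + (6 \cdot 1) + (0 \cdot 2) &= 12
\end{aligned}$$

$$AB = \begin{bmatrix} 12 & 27 & 30 & 13 \\ 8 & -4 & 26 & 12 \end{bmatrix}$$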









The definition of matrix multiplication requires that the number of columns of the first factor A be the same as the number of
rows of the second factor B in order to form the product AB. If this condition is not satisfied, the product is undefined. A
convenient way to determine whether a product of two matrices is defined is to write down the size of the first factor and, to
the right of it, write down the size of the second factor. If, as in (3), the inside numbers are the same, then the product is
defined. The outside numbers then give the size of the product.



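$$\underset{m \times r}{A} \quad \underset{r \times n}{B} \;=\; \underset{m \times n}{AB} \tag{3}$$

Inside numbers: r and r. Outside numbers: m and n.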



EXAMPLE 6 Determining Whether a Product Is Defined 



Suppose that A, B, and C are matrices with the following sizes:

   A        B        C
 3 x 4    4 x 7    7 x 3

Then by (3), AB is defined and is a 3 x 7 matrix; BC is defined and is a 4 x 3 matrix; and CA is defined and is a 7 x 4 matrix.
The products AC, CB, and BA are all undefined.
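The inside/outside rule in (3) is easy to mechanize. The sketch below is illustrative only; `product_size` is a hypothetical helper name, and sizes are represented as (rows, columns) pairs.

```python
def product_size(size_a, size_b):
    """Size of the product, or None when the inside numbers differ."""
    (m, r), (r2, n) = size_a, size_b
    return (m, n) if r == r2 else None

# The sizes given in Example 6
sizes = {"A": (3, 4), "B": (4, 7), "C": (7, 3)}

for left in sizes:
    for right in sizes:
        if left != right:
            print(left + right, product_size(sizes[left], sizes[right]))
```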



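In general, if $A = [a_{ij}]$ is an m x r matrix and $B = [b_{ij}]$ is an r x n matrix, then, as illustrated by row i of A and column j of B in (4),

$$AB = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1r} \\ a_{21} & a_{22} & \cdots & a_{2r} \\ \vdots & \vdots & & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{ir} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mr} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1j} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2j} & \cdots & b_{2n} \\ \vdots & \vdots & & \vdots & & \vdots \\ b_{r1} & b_{r2} & \cdots & b_{rj} & \cdots & b_{rn} \end{bmatrix} \tag{4}$$

the entry $(AB)_{ij}$ in row i and column j of AB is given by

$$(AB)_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + a_{i3}b_{3j} + \cdots + a_{ir}b_{rj} \tag{5}$$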



Partitioned Matrices 

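A matrix can be subdivided or partitioned into smaller matrices by inserting horizontal and vertical rules between selected
rows and columns. For example, the following are three possible partitions of a general 3 x 4 matrix A: the first is a partition
of A into four submatrices $A_{11}$, $A_{12}$, $A_{21}$, and $A_{22}$; the second is a partition of A into its row matrices $\mathbf{r}_1$, $\mathbf{r}_2$, and $\mathbf{r}_3$; and the
third is a partition of A into its column matrices $\mathbf{c}_1$, $\mathbf{c}_2$, $\mathbf{c}_3$, and $\mathbf{c}_4$:

$$A = \left[\begin{array}{ccc|c} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ \hline a_{31} & a_{32} & a_{33} & a_{34} \end{array}\right] = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$$

$$A = \left[\begin{array}{cccc} a_{11} & a_{12} & a_{13} & a_{14} \\ \hline a_{21} & a_{22} & a_{23} & a_{24} \\ \hline a_{31} & a_{32} & a_{33} & a_{34} \end{array}\right] = \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \mathbf{r}_3 \end{bmatrix}$$

$$A = \left[\begin{array}{c|c|c|c} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{array}\right] = \begin{bmatrix} \mathbf{c}_1 & \mathbf{c}_2 & \mathbf{c}_3 & \mathbf{c}_4 \end{bmatrix}$$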



Matrix Multiplication by Columns and by Rows 

Sometimes it may be desirable to find a particular row or column of a matrix product AB without computing the entire 
product. The following results, whose proofs are left as exercises, are useful for that purpose: 

jth column matrix of AB = A[jth column matrix of B]     (6)

ith row matrix of AB = [ith row matrix of A]B     (7)



EXAMPLE 7 Example 5 Revisited 



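If A and B are the matrices in Example 5, then from (6) the second column matrix of AB can be obtained by the computation

$$\begin{bmatrix} 1 & 2 & 4 \\ 2 & 6 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \\ 7 \end{bmatrix} = \begin{bmatrix} 27 \\ -4 \end{bmatrix}$$

(the column on the left is the second column of B; the column on the right is the second column of AB)

and from (7) the first row matrix of AB can be obtained by the computation

$$\begin{bmatrix} 1 & 2 & 4 \end{bmatrix} \begin{bmatrix} 4 & 1 & 4 & 3 \\ 0 & -1 & 3 & 1 \\ 2 & 7 & 5 & 2 \end{bmatrix} = \begin{bmatrix} 12 & 27 & 30 & 13 \end{bmatrix}$$

(the row on the left is the first row of A; the row on the right is the first row of AB)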
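Formulas (6) and (7) can be sketched as two small functions that compute a single column or row of AB without forming the whole product. The function names and the 0-based indexing are assumptions of this illustration, not notation from the text.

```python
def column_of_product(A, B, j):
    """Column j (0-based) of AB, computed as A times column j of B."""
    bj = [row[j] for row in B]
    return [sum(a * b for a, b in zip(row, bj)) for row in A]

def row_of_product(A, B, i):
    """Row i (0-based) of AB, computed as (row i of A) times B."""
    ai = A[i]
    return [sum(ai[k] * B[k][j] for k in range(len(B)))
            for j in range(len(B[0]))]

# The matrices of Example 5
A = [[1, 2, 4], [2, 6, 0]]
B = [[4, 1, 4, 3], [0, -1, 3, 1], [2, 7, 5, 2]]

second_column = column_of_product(A, B, 1)   # second column of AB
first_row = row_of_product(A, B, 0)          # first row of AB
print(second_column, first_row)
```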



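If $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_m$ denote the row matrices of A and $\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n$ denote the column matrices of B, then it follows from
Formulas (6) and (7) that

$$AB = A\begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_n \end{bmatrix} = \begin{bmatrix} A\mathbf{b}_1 & A\mathbf{b}_2 & \cdots & A\mathbf{b}_n \end{bmatrix} \tag{8}$$

(AB computed column by column)

$$AB = \begin{bmatrix} \mathbf{a}_1 \\ \mathbf{a}_2 \\ \vdots \\ \mathbf{a}_m \end{bmatrix} B = \begin{bmatrix} \mathbf{a}_1B \\ \mathbf{a}_2B \\ \vdots \\ \mathbf{a}_mB \end{bmatrix} \tag{9}$$

(AB computed row by row)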



Remark Formulas 8 and 9 are special cases of a more general procedure for multiplying partitioned matrices (see Exercises 
15, 16 and 17). 

Matrix Products as Linear Combinations 

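Row and column matrices provide an alternative way of thinking about matrix multiplication. For example, suppose that

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \quad\text{and}\quad \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$

Then

$$A\mathbf{x} = \begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{bmatrix} = x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} + x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} + \cdots + x_n \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} \tag{10}$$

In words, (10) tells us that the product Ax of a matrix A with a column matrix x is a linear combination of the column matrices
of A with the coefficients coming from the matrix x. In the exercises we ask the reader to show that the product yA of a 1 x m
matrix y with an m x n matrix A is a linear combination of the row matrices of A with scalar coefficients coming from y.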



EXAMPLE 8 Linear Combinations 



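The matrix product

$$\begin{bmatrix} -1 & 3 & 2 \\ 1 & 2 & -3 \\ 2 & 1 & -2 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix} = \begin{bmatrix} 1 \\ -9 \\ -3 \end{bmatrix}$$

can be written as the linear combination of column matrices

$$2 \begin{bmatrix} -1 \\ 1 \\ 2 \end{bmatrix} - 1 \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} + 3 \begin{bmatrix} 2 \\ -3 \\ -2 \end{bmatrix} = \begin{bmatrix} 1 \\ -9 \\ -3 \end{bmatrix}$$

The matrix product

$$\begin{bmatrix} 1 & -9 & -3 \end{bmatrix} \begin{bmatrix} -1 & 3 & 2 \\ 1 & 2 & -3 \\ 2 & 1 & -2 \end{bmatrix} = \begin{bmatrix} -16 & -18 & 35 \end{bmatrix}$$

can be written as the linear combination of row matrices

$$1\begin{bmatrix} -1 & 3 & 2 \end{bmatrix} - 9\begin{bmatrix} 1 & 2 & -3 \end{bmatrix} - 3\begin{bmatrix} 2 & 1 & -2 \end{bmatrix} = \begin{bmatrix} -16 & -18 & 35 \end{bmatrix}$$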
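Both computations in this example can be checked with a short sketch that forms the linear combinations directly instead of using the row-by-column rule. The helper names `columns` and `combine` are hypothetical and chosen only for this illustration.

```python
def columns(A):
    """Return the column matrices of A as plain lists."""
    return [list(col) for col in zip(*A)]

def combine(coeffs, vectors):
    """Return coeffs[0]*vectors[0] + coeffs[1]*vectors[1] + ..."""
    return [sum(c * v[i] for c, v in zip(coeffs, vectors))
            for i in range(len(vectors[0]))]

A = [[-1, 3, 2], [1, 2, -3], [2, 1, -2]]
x = [2, -1, 3]     # column matrix, written flat
y = [1, -9, -3]    # row matrix

Ax = combine(x, columns(A))   # columns of A weighted by the entries of x
yA = combine(y, A)            # rows of A weighted by the entries of y
print(Ax, yA)
```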



It follows from (8) and (10) that the jth column matrix of a product AB is a linear combination of the column matrices of A with
the coefficients coming from the jth column of B.



EXAMPLE 9 Columns of a Product AB as Linear Combinations 



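We showed in Example 5 that

$$AB = \begin{bmatrix} 1 & 2 & 4 \\ 2 & 6 & 0 \end{bmatrix} \begin{bmatrix} 4 & 1 & 4 & 3 \\ 0 & -1 & 3 & 1 \\ 2 & 7 & 5 & 2 \end{bmatrix} = \begin{bmatrix} 12 & 27 & 30 & 13 \\ 8 & -4 & 26 & 12 \end{bmatrix}$$

The column matrices of AB can be expressed as linear combinations of the column matrices of A as follows:

$$\begin{aligned}
\begin{bmatrix} 12 \\ 8 \end{bmatrix} &= 4\begin{bmatrix} 1 \\ 2 \end{bmatrix} + 0\begin{bmatrix} 2 \\ 6 \end{bmatrix} + 2\begin{bmatrix} 4 \\ 0 \end{bmatrix} \\
\begin{bmatrix} 27 \\ -4 \end{bmatrix} &= \begin{bmatrix} 1 \\ 2 \end{bmatrix} - \begin{bmatrix} 2 \\ 6 \end{bmatrix} + 7\begin{bmatrix} 4 \\ 0 \end{bmatrix} \\
\begin{bmatrix} 30 \\ 26 \end{bmatrix} &= 4\begin{bmatrix} 1 \\ 2 \end{bmatrix} + 3\begin{bmatrix} 2 \\ 6 \end{bmatrix} + 5\begin{bmatrix} 4 \\ 0 \end{bmatrix} \\
\begin{bmatrix} 13 \\ 12 \end{bmatrix} &= 3\begin{bmatrix} 1 \\ 2 \end{bmatrix} + \begin{bmatrix} 2 \\ 6 \end{bmatrix} + 2\begin{bmatrix} 4 \\ 0 \end{bmatrix}
\end{aligned}$$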



Matrix Form of a Linear System 

Matrix multiplication has an important application to systems of linear equations. Consider any system of m linear equations 
in n unknowns. 

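$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}$$

Since two matrices are equal if and only if their corresponding entries are equal, we can replace the m equations in this
system by the single matrix equation

$$\begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$$

The m x 1 matrix on the left side of this equation can be written as a product to give

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$$

If we designate these matrices by A, x, and b, respectively, then the original system of m equations in n unknowns has been
replaced by the single matrix equation

$$A\mathbf{x} = \mathbf{b}$$

The matrix A in this equation is called the coefficient matrix of the system. The augmented matrix for the system is obtained
by adjoining b to A as the last column; thus the augmented matrix is

$$[A \mid \mathbf{b}] = \left[\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} & b_m \end{array}\right]$$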






Matrices Defining Functions 

The equation Ax = b with A and b given defines a linear system to be solved for x. But we could also write this equation as
y = Ax, where A and x are given. In this case, we want to compute y. If A is m x n, then this is a function that associates with
every n x 1 column vector x an m x 1 column vector y, and we may view A as defining a rule that shows how a given x is
mapped into a corresponding y. This idea is discussed in more detail starting in Section 4.2.



EXAMPLE 10 A Function Using Matrices 



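Consider the following matrices.

$$A = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \qquad \mathbf{x} = \begin{bmatrix} a \\ b \end{bmatrix}$$

The product y = Ax is

$$\mathbf{y} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} a \\ -b \end{bmatrix}$$

so the effect of multiplying A by a column vector is to change the sign of the second entry of the column vector. For the
matrix

$$B = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$$

the product y = Bx is

$$\mathbf{y} = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} b \\ -a \end{bmatrix}$$

so the effect of multiplying B by a column vector is to interchange the first and second entries of the column vector, also
changing the sign of the first entry.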

If we view the column vector x as locating a point (a, b) in the plane, then the effect of A is to reflect the point about the
x-axis (Figure 1.3.1a) whereas the effect of B is to rotate the line segment from the origin to the point through a right angle
(Figure 1.3.1b).
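The two functions in this example can be tried on a concrete point. The sketch below is an illustration under two assumptions: the matrices are taken to be A = [[1, 0], [0, -1]] and B = [[0, 1], [-1, 0]], and the helper name `transform` and the sample point (3, 2) are chosen here, not taken from the text.

```python
def transform(M, x):
    """Return y = Mx for a matrix M (list of rows) and a column vector x (flat list)."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

A = [[1, 0], [0, -1]]    # changes the sign of the second entry
B = [[0, 1], [-1, 0]]    # swaps the entries, negating the original first entry

point = [3, 2]
reflected = transform(A, point)   # mirror image across the x-axis
rotated = transform(B, point)     # perpendicular to the original segment

# A right angle means the dot product with the original vector is zero.
dot = sum(p * q for p, q in zip(point, rotated))
print(reflected, rotated, dot)
```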



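[Figure 1.3.1: (a) the point (a, b) reflected about the x-axis; (b) the segment from the origin to (a, b) rotated through a right angle]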


Transpose of a Matrix 

We conclude this section by defining two matrix operations that have no analogs in the real numbers. 



DEFINITION 



If A is any m x n matrix, then the transpose of A, denoted by $A^T$, is defined to be the n x m matrix that results from
interchanging the rows and columns of A; that is, the first column of $A^T$ is the first row of A, the second column of $A^T$ is
the second row of A, and so forth.



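EXAMPLE 11 Some Transposes



The following are some examples of matrices and their transposes.

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix}, \qquad A^T = \begin{bmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \\ a_{14} & a_{24} & a_{34} \end{bmatrix}$$

$$B = \begin{bmatrix} 2 & 3 \\ 1 & 4 \\ 5 & 6 \end{bmatrix}, \qquad B^T = \begin{bmatrix} 2 & 1 & 5 \\ 3 & 4 & 6 \end{bmatrix}$$

$$C = \begin{bmatrix} 1 & 3 & 5 \end{bmatrix}, \qquad C^T = \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix}$$

$$D = [4], \qquad D^T = [4]$$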



Observe that not only are the columns of $A^T$ the rows of A, but the rows of $A^T$ are the columns of A. Thus the entry in row i
and column j of $A^T$ is the entry in row j and column i of A; that is,

$$(A^T)_{ij} = (A)_{ji} \tag{11}$$

Note the reversal of the subscripts.
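Formula (11) can be illustrated with a brief sketch. The function name `transpose` and the list-of-rows representation are assumptions of this illustration; the matrices are B and C from Example 11.

```python
def transpose(A):
    """Return the transpose: row i of the result is column i of A."""
    return [list(col) for col in zip(*A)]

B = [[2, 3], [1, 4], [5, 6]]   # 3 x 2
C = [[1, 3, 5]]                # 1 x 3

BT = transpose(B)              # 2 x 3
CT = transpose(C)              # 3 x 1

# Check the subscript reversal (A^T)_ij = (A)_ji entry by entry.
reversal_holds = all(BT[i][j] == B[j][i]
                     for i in range(len(BT)) for j in range(len(BT[0])))
print(BT, CT, reversal_holds)
```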

In the special case where A is a square matrix, the transpose of A can be obtained by interchanging entries that are
symmetrically positioned about the main diagonal. In (12) it is shown that $A^T$ can also be obtained by "reflecting" A about its
main diagonal.



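$$A = \begin{bmatrix} 1 & -2 & 4 \\ 3 & 7 & 0 \\ -5 & 8 & 6 \end{bmatrix} \;\longrightarrow\; A^T = \begin{bmatrix} 1 & 3 & -5 \\ -2 & 7 & 8 \\ 4 & 0 & 6 \end{bmatrix} \tag{12}$$

Interchange entries that are symmetrically positioned about the main diagonal.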



DEFINITION 



If A is a square matrix, then the trace of A, denoted by tr(A), is defined to be the sum of the entries on the main diagonal
of A. The trace of A is undefined if A is not a square matrix.



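EXAMPLE 12 Trace of a Matrix



The following are examples of matrices and their traces.

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}, \qquad B = \begin{bmatrix} -1 & 2 & 7 & 0 \\ 3 & 5 & -8 & 4 \\ 1 & 2 & 7 & -3 \\ 4 & -2 & 1 & 0 \end{bmatrix}$$

$$\operatorname{tr}(A) = a_{11} + a_{22} + a_{33}, \qquad \operatorname{tr}(B) = -1 + 5 + 7 + 0 = 11$$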
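A minimal sketch of the trace, assuming a matrix is stored as a list of rows; the function name `trace` is chosen here for illustration. It is applied to the 4 x 4 matrix B of this example.

```python
def trace(A):
    """Sum of the main-diagonal entries; defined only for square matrices."""
    n = len(A)
    if any(len(row) != n for row in A):
        raise ValueError("the trace is undefined for a non-square matrix")
    return sum(A[i][i] for i in range(n))

B = [[-1, 2, 7, 0],
     [3, 5, -8, 4],
     [1, 2, 7, -3],
     [4, -2, 1, 0]]

print(trace(B))   # -1 + 5 + 7 + 0
```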



Exercise Set 1.3






1. Suppose that A, B, C, D, and E are matrices with the following sizes:

     A         B         C         D         E
  (4 x 5)   (4 x 5)   (5 x 2)   (4 x 2)   (5 x 4)

Determine which of the following matrix expressions are defined. For those that are defined, give the size of the resulting
matrix.

(a) BA

(b) AC + D

(c) AE + B

(d) AB + B

(e) E(A + B)

(f) E(AC)

(g) $E^TA$

(h) $(A^T + E)D$

2. Solve the following matrix equation for a, b, c, and d.

$$\begin{bmatrix} a - b & b + c \\ 3d + c & 2a - 4d \end{bmatrix} = \begin{bmatrix} 8 & 1 \\ 7 & 6 \end{bmatrix}$$


Consider the matrices 

3 

-1 2 

1 1 



A = 



B = 



D = 



1 5 2 
1 1 
3 2 4 



E = 



C = 



1 4 2 
3 1 5 



6 1 3 

■1 1 2 

4 1 3 



Compute the following (where possible).

(a) D + E

(b) D - E

(c) 5A

(d) -7C

(e) 2B - C

(f) 4E - 2D

(g) -3(D + 2E)

(h) A - A

(i) tr(D)

(j) tr(D - 3E)

(k) 4 tr(7B)

(l) tr(A)

4. Using the matrices in Exercise 3, compute the following (where possible).

(a) $2A^T + C$

(b) $D^T - E^T$

(c) $(D - E)^T$

(d) $B^T + 5C^T$

(e) $\tfrac{1}{2}C^T - \tfrac{1}{4}A$

(f) $B - B^T$

(g) $2E^T - 3D^T$

(h) $(2E^T - 3D^T)^T$



5. Using the matrices in Exercise 3, compute the following (where possible).



(a) ^ 

(b) BA 

(c) (3£)£> 

(d) (AB) C 

(e) 4(5C) 

(f) CC T 

(g) (ZL4) r 
(h) (C^)^ 1 
(i) tr(DD T ) 

(j) tr(4£ r -£)) 
(k) tr(C T A T +2E T ) 



6. Using the matrices in Exercise 3, compute the following (where possible).



(a) (2D T -E)A 

(b) (AB) C I 25 



(c) (-AC) T +5D T 

( d ) (BA T -2C) T 

(e) 3 T (CC T -A T A) 

(f) D T E T -(ED) T 



Let 



,4 = 



-2 7 
5 4 
4 9 



and B = 



6 


-2 4 





1 3 


7 


7 5 



Use the method of Example 7 to find

(a) the first row of AB

(b) the third row of AB

(c) the second column of AB

(d) the first column of BA

(e) the third row of AA

(f) the third column of AA

8. Let A and B be the matrices in Exercise 7. Use the method of Example 9 to



(a) express each column matrix of AB as a linear combination of the column matrices of A 



(b) express each column matrix of BA as a linear combination of the column matrices of B 



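9. Let

$$\mathbf{y} = \begin{bmatrix} y_1 & y_2 & \cdots & y_m \end{bmatrix} \quad\text{and}\quad A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$

(a) Show that the product yA can be expressed as a linear combination of the row matrices of A with the scalar
coefficients coming from y.

(b) Relate this to the method of Example 8.
Hint Use the transpose operation.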



10. Let A and B be the matrices in Exercise 7.



(a) Use the result in Exercise 9 to express each row matrix of AB as a linear combination of the row matrices of B. 

(b) Use the result in Exercise 9 to express each row matrix of BA as a linear combination of the row matrices of A. 



11. Let C, D, and E be the matrices in Exercise 3. Using as few computations as possible, determine the entry in row 2 and
column 3 of C(DE).



12. 

(a) Show that if AB and BA are both defined, then AB and BA are square matrices. 



(b) Show that if A is an m x n matrix and A(BA) is defined, then B is an n x m matrix.



13. In each part, find matrices A, x, and b that express the given system of linear equations as a single matrix equation
Ax = b.



(a) 2*i-3*2 I 5*3= 7 
9*1 - * 2 H- *3= - 1 

*1 +5*2+4*3= 

(b) 4*i -3*3 I * 4 =1 
5*i + *2 — 3*4 = 3 
2*i — 5*2 I 9*3— *4 = 

3*2 — *3 I 7*4= 2 



14. In each part, express the matrix equation as a system of linear equations.



(a) 



3 

4 

■2 



1 2" 


"*l" 




2" 


3 7 


*2 


= 


-1 


1 5 


*3 




4 



(b) 



2 

2 

1 4 
5 1 



f 


~M>~ 




"o" 


-2 
7 


* 
7 


= 






6 


Z 








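15. If A and B are partitioned into submatrices, for example,

$$A = \left[\begin{array}{c|c} A_{11} & A_{12} \\ \hline A_{21} & A_{22} \end{array}\right] \quad\text{and}\quad B = \left[\begin{array}{c|c} B_{11} & B_{12} \\ \hline B_{21} & B_{22} \end{array}\right]$$

then AB can be expressed as

$$AB = \left[\begin{array}{c|c} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ \hline A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{array}\right]$$

provided the sizes of the submatrices of A and B are such that the indicated operations can be performed. This method of
multiplying partitioned matrices is called block multiplication. In each part, compute the product by block
multiplication. Check your results by multiplying directly.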



(a) 



A = 



-] 


2 


] 5" 





-3 


4 2 


1 


5 


6 I 



B = 



~ 2 


1 


4T 


-3 


5 


2 


7 


-1 


5 





3 


-3_ 



(b) 



A = 



-1 


2 


1 


5 





-3 


4 


2 


1 


5 


6 


1 



B = 



2 


I 


4" 


-3 


5 


2 


7 


^1 


5 


(J 


3 


-3_ 



16. 



Adapt the method of Exercise 15 to compute the following products by block multiplication. 



(a) 



3 -1 

2 1 





4 



i] 



'2 


-4 


1' 


3 





^ 


] 


-3 


5 



(b) 



"2 


-5" 




1 


3 







5 




1 


4 





-I 



3 -4 

5 7 



(c) 



1 


(i 


(.1 











i 


n 














I 

















2 





n 


o 


[) 


-1 


- 1 





r 3 


.n 




-i 
i 


4 
5 


2 


-2 




1 


(>_ 



17. In each part, determine whether block multiplication can be used to compute AB from the given partitions. If so, compute
the product by block multiplication.

Note See Exercise 15.



(a) 



A = 



-J 


2 


1 


5 





-3 


4 


2 


I 


5 


fl 


1 



(b) 



A = 



-1 


: 


] 


5 





-3 


4 


2 


1 


5 


6 


1 



fl = 



5 = 



2 


1 


4 


-3 


5 


2 


7 


-I 


5 





3 


-3. 


2 


1 


4' 


-3 


5 


2 


7 


-1 


5 





3 


-3 



18. 



(a) Show that if A has a row of zeros and B is any matrix for which AB is defined, then AB also has a row of zeros. 



(b) Find a similar result involving a column of zeros. 



19. Let A be any m x n matrix and let 0 be the m x n matrix each of whose entries is zero. Show that if kA = 0, then k = 0 or
A = 0.



20. Let I be the n x n matrix whose entry in row i and column j is

$$\begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$

Show that AI = IA = A for every n x n matrix A.



21. In each part, find a 6 x 6 matrix $[a_{ij}]$ that satisfies the stated condition. Make your answers as general as possible by
using letters rather than specific numbers for the nonzero entries.

(a) $a_{ij} = 0$ if $i \neq j$

(b) $a_{ij} = 0$ if $i > j$

(c) $a_{ij} = 0$ if $i < j$

(d) $a_{ij} = 0$ if $|i - j| > 1$



22. Find the 4 x 4 matrix $A = [a_{ij}]$ whose entries satisfy the stated condition.

(a) $a_{ij} = i + j$

(b) $a_{ij} = i^{\,j-1}$

(c) $a_{ij} = \begin{cases} 1 & \text{if } |i - j| > 1 \\ -1 & \text{if } |i - j| \le 1 \end{cases}$

23. Consider the function y = f(x) defined for 2 x 1 matrices x by y = Ax, where



j4 = 



1 1 
1 



Plot f(x) together with x in each case below. How would you describe the action of f?



(a) 



X = 



(b) 



X = 



(c) 



X = 



(d) ._ 



2 
-2 



24. Let A be an n x m matrix. Show that the function y = f(x) defined for m x 1 matrices x by y = Ax satisfies the linearity
property $f(\alpha\mathbf{w} + \beta\mathbf{z}) = \alpha f(\mathbf{w}) + \beta f(\mathbf{z})$ for any real numbers $\alpha$ and $\beta$ and any m x 1 matrices w and z.



25. Prove: If A and B are n x n matrices, then tr(A + B) = tr(A) + tr(B).



Discussion and Discovery

26. Describe three different methods for computing a matrix product, and illustrate the methods by
computing some product AB three different ways.



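27. How many 3 x 3 matrices A can you find such that

$$A\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + y \\ x - y \\ 0 \end{bmatrix}$$

for all choices of x, y, and z?

28. How many 3 x 3 matrices A can you find such that

$$A\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} xy \\ 0 \\ 0 \end{bmatrix}$$

for all choices of x, y, and z?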



29. A matrix B is said to be a square root of a matrix A if BB = A.

(a) Find two square roots of $A = \begin{bmatrix} 2 & 2 \\ 2 & 2 \end{bmatrix}$.

(b) How many different square roots can you find of $A = \begin{bmatrix} 5 & 0 \\ 0 & 9 \end{bmatrix}$?

(c) Do you think that every 2 x 2 matrix has at least one square root? Explain your
reasoning.



30. Let 0 denote a 2 x 2 matrix, each of whose entries is zero.

(a) Is there a 2 x 2 matrix A such that $A \neq 0$ and AA = 0? Justify your answer.

(b) Is there a 2 x 2 matrix A such that $A \neq 0$ and AA = A? Justify your answer.



31. Indicate whether the statement is always true or sometimes false. Justify your answer with a
logical argument or a counterexample.

(a) The expressions $\operatorname{tr}(AA^T)$ and $\operatorname{tr}(A^TA)$ are always defined, regardless of the size of A.

(b) $\operatorname{tr}(AA^T) = \operatorname{tr}(A^TA)$ for every matrix A.

(c) If the first column of A has all zeros, then so does the first column of every product AB.

(d) If the first row of A has all zeros, then so does the first row of every product AB.



32. Indicate whether the statement is always true or sometimes false. Justify your answer with a
logical argument or a counterexample.

(a) If A is a square matrix with two identical rows, then AA has two identical rows.

(b) If A is a square matrix and AA has a column of zeros, then A must have a column of
zeros.

(c) If B is an n x n matrix whose entries are positive even integers, and if A is an n x n
matrix whose entries are positive integers, then the entries of AB and BA are positive
even integers.

(d) If the matrix sum AB + BA is defined, then A and B must be square.

33. Suppose the array

$$\begin{bmatrix} 4 & 3 & 3 \\ 2 & 1 & 0 \\ 4 & 4 & 2 \end{bmatrix}$$

represents the orders placed by three individuals at a fast-food restaurant. The first person
orders 4 burgers, 3 sodas, and 3 fries; the second orders 2 burgers and 1 soda; and the third
orders 4 burgers, 4 sodas, and 2 fries. Burgers cost $2 each, sodas $1 each, and fries $1.50
each.



(a) Argue that the amounts owed by these persons may be represented as a function 
y = f (x), where f(x) is equal to the array given above times a certain vector. 



(b) Compute the amounts owed in this case by performing the appropriate multiplication. 



(c) Change the matrix for the case in which the second person orders an additional soda 
and 2 fries, and recompute the costs. 



Copyright © 2005 John Wiley & Sons, Inc. All rights reserved. 



1.4 

INVERSES; RULES OF 
MATRIX ARITHMETIC 



In this section we shall discuss some properties of the arithmetic operations 
on matrices. We shall see that many of the basic rules of arithmetic for real 
numbers also hold for matrices, but a few do not. 



Properties of Matrix Operations 

For real numbers a and b, we always have ab = ba, which is called the commutative law for multiplication. For matrices,
however, AB and BA need not be equal. Equality can fail to hold for three reasons: It can happen that the product AB is defined
but BA is undefined. For example, this is the case if A is a 2 x 3 matrix and B is a 3 x 4 matrix. Also, it can happen that AB and
BA are both defined but have different sizes. This is the situation if A is a 2 x 3 matrix and B is a 3 x 2 matrix. Finally, as
Example 1 shows, it is possible to have $AB \neq BA$ even if both AB and BA are defined and have the same size.



EXAMPLE 1 AB and BA Need Not Be Equal 



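Consider the matrices

$$A = \begin{bmatrix} -1 & 0 \\ 2 & 3 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 2 \\ 3 & 0 \end{bmatrix}$$

Multiplying gives

$$AB = \begin{bmatrix} -1 & -2 \\ 11 & 4 \end{bmatrix}, \qquad BA = \begin{bmatrix} 3 & 6 \\ -3 & 0 \end{bmatrix}$$

Thus, $AB \neq BA$.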
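The failure of commutativity is easy to confirm numerically. This is a small sketch, assuming the matrices A = [[-1, 0], [2, 3]] and B = [[1, 2], [3, 0]] of Example 1 and a hypothetical helper `matmul` that implements the row-by-column rule.

```python
def matmul(A, B):
    """Row-by-column product of A (m x r) and B (r x n)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[-1, 0], [2, 3]]
B = [[1, 2], [3, 0]]

AB = matmul(A, B)
BA = matmul(B, A)
print(AB, BA, AB == BA)   # same size, different entries
```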




Although the commutative law for multiplication is not valid in matrix arithmetic, many familiar laws of arithmetic are valid 
for matrices. Some of the most important ones and their names are summarized in the following theorem. 



THEOREM 1.4.1 



Properties of Matrix Arithmetic 

Assuming that the sizes of the matrices are such that the indicated operations can be performed, the following rules of 
matrix arithmetic are valid. 



(a) A + B = B + A (Commutative law for addition)

(b) A + (B + C) = (A + B) + C (Associative law for addition)

(c) A(BC) = (AB)C (Associative law for multiplication)

(d) A(B + C) = AB + AC (Left distributive law)

(e) (B + C)A = BA + CA (Right distributive law)

(f) A(B - C) = AB - AC

(g) (B - C)A = BA - CA

(h) a(B + C) = aB + aC

(i) a(B - C) = aB - aC

(j) (a + b)C = aC + bC

(k) (a - b)C = aC - bC

(l) a(bC) = (ab)C

(m) a(BC) = (aB)C = B(aC)



To prove the equalities in this theorem, we must show that the matrix on the left side has the same size as the matrix on the 
right side and that corresponding entries on the two sides are equal. With the exception of the associative law in part (c), the 
proofs all follow the same general pattern. We shall prove part (d) as an illustration. The proof of the associative law, which is 
more complicated, is outlined in the exercises. 



Proof (d) We must show that A(B + C) and AB + AC have the same size and that corresponding entries are equal. To form
A(B + C), the matrices B and C must have the same size, say m x n, and the matrix A must then have m columns, so its size
must be of the form r x m. This makes A(B + C) an r x n matrix. It follows that AB + AC is also an r x n matrix and,
consequently, A(B + C) and AB + AC have the same size.

Suppose that $A = [a_{ij}]$, $B = [b_{ij}]$, and $C = [c_{ij}]$. We want to show that corresponding entries of A(B + C) and AB + AC are
equal; that is,

$$[A(B + C)]_{ij} = [AB + AC]_{ij}$$

for all values of i and j. But from the definitions of matrix addition and matrix multiplication, we have

$$\begin{aligned}
[A(B + C)]_{ij} &= a_{i1}(b_{1j} + c_{1j}) + a_{i2}(b_{2j} + c_{2j}) + \cdots + a_{im}(b_{mj} + c_{mj}) \\
&= (a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{im}b_{mj}) + (a_{i1}c_{1j} + a_{i2}c_{2j} + \cdots + a_{im}c_{mj}) \\
&= [AB]_{ij} + [AC]_{ij} = [AB + AC]_{ij}
\end{aligned}$$



Remark Although the operations of matrix addition and matrix multiplication were defined for pairs of matrices, associative 
laws (b) and (c) enable us to denote sums and products of three matrices as A + B + C and ABC without inserting any
parentheses. This is justified by the fact that no matter how parentheses are inserted, the associative laws guarantee that the 
same end result will be obtained. In general, given any sum or any product of matrices, pairs of parentheses can be inserted or 
deleted anywhere within the expression without affecting the end result. 



EXAMPLE 2 Associativity of Matrix Multiplication 



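As an illustration of the associative law for matrix multiplication, consider

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 0 & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} 4 & 3 \\ 2 & 1 \end{bmatrix}, \qquad C = \begin{bmatrix} 1 & 0 \\ 2 & 3 \end{bmatrix}$$

Then

$$AB = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 4 & 3 \\ 2 & 1 \end{bmatrix} = \begin{bmatrix} 8 & 5 \\ 20 & 13 \\ 2 & 1 \end{bmatrix} \quad\text{and}\quad BC = \begin{bmatrix} 4 & 3 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} 10 & 9 \\ 4 & 3 \end{bmatrix}$$

Thus

$$(AB)C = \begin{bmatrix} 8 & 5 \\ 20 & 13 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} 18 & 15 \\ 46 & 39 \\ 4 & 3 \end{bmatrix}$$

and

$$A(BC) = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 10 & 9 \\ 4 & 3 \end{bmatrix} = \begin{bmatrix} 18 & 15 \\ 46 & 39 \\ 4 & 3 \end{bmatrix}$$

so (AB)C = A(BC), as guaranteed by Theorem 1.4.1c.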
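The same check can be run in code. The sketch below assumes the matrices A = [[1, 2], [3, 4], [0, 1]], B = [[4, 3], [2, 1]], and C = [[1, 0], [2, 3]] of this example, with a hypothetical helper `matmul`; agreement on one example illustrates Theorem 1.4.1c but of course does not prove it.

```python
def matmul(A, B):
    """Row-by-column product of A (m x r) and B (r x n)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [3, 4], [0, 1]]
B = [[4, 3], [2, 1]]
C = [[1, 0], [2, 3]]

left = matmul(matmul(A, B), C)    # (AB)C
right = matmul(A, matmul(B, C))   # A(BC)
print(left, right, left == right)
```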



Zero Matrices 

A matrix, all of whose entries are zero, such as

$$\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad [0]$$

is called a zero matrix. A zero matrix will be denoted by 0; if it is important to emphasize the size, we shall write $0_{m \times n}$ for the
m x n zero matrix. Moreover, in keeping with our convention of using boldface symbols for matrices with one column, we will
denote a zero matrix with one column by $\mathbf{0}$.

If A is any matrix and 0 is the zero matrix with the same size, it is obvious that A + 0 = 0 + A = A. The matrix 0 plays much
the same role in these matrix equations as the number 0 plays in the numerical equations a + 0 = 0 + a = a.

Since we already know that some of the rules of arithmetic for real numbers do not carry over to matrix arithmetic, it would be 
foolhardy to assume that all the properties of the real number zero carry over to zero matrices. For example, consider the 
following two standard results in the arithmetic of real numbers. 



If ab = ac and $a \neq 0$, then b = c. (This is called the cancellation law.)



If ad = 0, then at least one of the factors on the left is 0.



As the next example shows, the corresponding results are not generally true in matrix arithmetic. 



EXAMPLE 3 The Cancellation Law Does Not Hold 



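Consider the matrices

$$A = \begin{bmatrix} 0 & 1 \\ 0 & 2 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 1 \\ 3 & 4 \end{bmatrix}, \qquad C = \begin{bmatrix} 2 & 5 \\ 3 & 4 \end{bmatrix}, \qquad D = \begin{bmatrix} 3 & 7 \\ 0 & 0 \end{bmatrix}$$

You should verify that

$$AB = AC = \begin{bmatrix} 3 & 4 \\ 6 & 8 \end{bmatrix} \quad\text{and}\quad AD = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$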








Thus, although $A \neq 0$, it is incorrect to cancel the A from both sides of the equation AB = AC and write B = C. Also, AD = 0,
yet $A \neq 0$ and $D \neq 0$. Thus, the cancellation law is not valid for matrix multiplication, and it is possible for a product of matrices
to be zero without either factor being zero.
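The verification requested in this example can be sketched as follows. The matrices are assumed to be A = [[0, 1], [0, 2]], B = [[1, 1], [3, 4]], C = [[2, 5], [3, 4]], and D = [[3, 7], [0, 0]]; `matmul` is a hypothetical helper implementing the row-by-column rule.

```python
def matmul(A, B):
    """Row-by-column product of A (m x r) and B (r x n)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[0, 1], [0, 2]]
B = [[1, 1], [3, 4]]
C = [[2, 5], [3, 4]]
D = [[3, 7], [0, 0]]
zero = [[0, 0], [0, 0]]

AB, AC, AD = matmul(A, B), matmul(A, C), matmul(A, D)
print(AB == AC, B == C)              # equal products, unequal factors
print(AD == zero, A == zero, D == zero)   # zero product, nonzero factors
```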

In spite of the above example, there are a number of familiar properties of the real number 0 that do carry over to zero matrices.
Some of the more important ones are summarized in the next theorem. The proofs are left as exercises. 



THEOREM 1.4.2 



Properties of Zero Matrices 

Assuming that the sizes of the matrices are such that the indicated operations can be performed, the following rules of 
matrix arithmetic are valid. 



(a) A + 0 = 0 + A = A

(b) A - A = 0

(c) 0 - A = -A

(d) A0 = 0; 0A = 0





Identity Matrices 

Of special interest are square matrices with 1's on the main diagonal and 0's off the main diagonal, such as

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad \text{and so on.}$$

A matrix of this form is called an identity matrix and is denoted by I. If it is important to emphasize the size, we shall write $I_n$
for the n x n identity matrix.

If A is an m x n matrix, then, as illustrated in the next example,

$$AI_n = A \quad\text{and}\quad I_mA = A$$

Thus, an identity matrix plays much the same role in matrix arithmetic that the number 1 plays in the numerical relationships
$a \cdot 1 = 1 \cdot a = a$.



EXAMPLE 4 Multiplication by an Identity Matrix 



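Consider the matrix

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}$$

Then

$$I_2A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} = A$$

and

$$AI_3 = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} = A$$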



As the next theorem shows, identity matrices arise naturally in studying reduced row-echelon forms of square matrices. 



THEOREM 1.4.3 



If R is the reduced row-echelon form of an n x n matrix A, then either R has a row of zeros or R is the identity matrix $I_n$.



Proof Suppose that the reduced row-echelon form of A is

$$R = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & & \vdots \\ r_{n1} & r_{n2} & \cdots & r_{nn} \end{bmatrix}$$

Either the last row in this matrix consists entirely of zeros or it does not. If not, the matrix contains no zero rows, and
consequently each of the n rows has a leading entry of 1. Since these leading 1's occur progressively farther to the right as we
move down the matrix, each of these 1's must occur on the main diagonal. Since the other entries in the same column as one of
these 1's are zero, R must be $I_n$. Thus, either R has a row of zeros or $R = I_n$.



DEFINITION 



If A is a square matrix, and if a matrix B of the same size can be found such that AB = BA = I, then A is said to be invertible
and B is called an inverse of A. If no such matrix B can be found, then A is said to be singular.



EXAMPLE 5 Verifying the Inverse Requirements 



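The matrix

$$B = \begin{bmatrix} 3 & 5 \\ 1 & 2 \end{bmatrix} \quad\text{is an inverse of}\quad A = \begin{bmatrix} 2 & -5 \\ -1 & 3 \end{bmatrix}$$

since

$$AB = \begin{bmatrix} 2 & -5 \\ -1 & 3 \end{bmatrix} \begin{bmatrix} 3 & 5 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I$$

and

$$BA = \begin{bmatrix} 3 & 5 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 2 & -5 \\ -1 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I$$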



EXAMPLE 6 A Matrix with No Inverse 



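The matrix

$$A = \begin{bmatrix} 1 & 4 & 0 \\ 2 & 5 & 0 \\ 3 & 6 & 0 \end{bmatrix}$$

is singular. To see why, let

$$B = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix}$$

be any 3 x 3 matrix. The third column of BA is

$$\begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

Thus

$$BA \neq I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$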



Properties of Inverses 

It is reasonable to ask whether an invertible matrix can have more than one inverse. The next theorem shows that the answer is
no: an invertible matrix has exactly one inverse.



THEOREM 1.4.4 



If B and C are both inverses of the matrix A, then B = C.



Proof Since B is an inverse of A, we have BA = I. Multiplying both sides on the right by C gives (BA)C = IC = C. But
(BA)C = B(AC) = BI = B, so C = B.



As a consequence of this important result, we can now speak of "the" inverse of an invertible matrix. If A is invertible, then its
inverse will be denoted by the symbol $A^{-1}$. Thus,

$$AA^{-1} = I \quad\text{and}\quad A^{-1}A = I$$

The inverse of A plays much the same role in matrix arithmetic that the reciprocal $a^{-1}$ plays in the numerical relationships
$aa^{-1} = 1$ and $a^{-1}a = 1$.

In the next section we shall develop a method for finding inverses of invertible matrices of any size; however, the following 
theorem gives conditions under which a 2 x 2 matrix is invertible and provides a simple formula for the inverse. 



THEOREM 1.4.5 



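The matrix

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

is invertible if $ad - bc \neq 0$, in which case the inverse is given by the formula

$$A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} = \begin{bmatrix} \dfrac{d}{ad - bc} & -\dfrac{b}{ad - bc} \\[2ex] -\dfrac{c}{ad - bc} & \dfrac{a}{ad - bc} \end{bmatrix}$$



Proof We leave it for the reader to verify that $AA^{-1} = I_2$ and $A^{-1}A = I_2$.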
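
The formula in Theorem 1.4.5 translates directly into a few lines of code. The following is a minimal sketch, not part of the text: the function name `inverse_2x2` is our own, and exact rational arithmetic (Python's `fractions` module) is an assumed choice made to avoid rounding.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Return the inverse of [[a, b], [c, d]] using the formula of Theorem 1.4.5.

    Raises ValueError when ad - bc == 0, the case the theorem excludes.
    """
    (a, b), (c, d) = m
    det = Fraction(a) * d - Fraction(b) * c  # ad - bc, kept exact
    if det == 0:
        raise ValueError("ad - bc = 0: the formula does not apply")
    return [[d / det, -b / det],
            [-c / det, a / det]]


# The matrix A of Example 5; the result agrees with the matrix B given there.
print(inverse_2x2([[2, -5], [-1, 3]]))
```

Calling the function on the matrix of Example 6 is not possible (it is 3 x 3), but any 2 x 2 matrix with $ad - bc = 0$ triggers the error branch.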



THEOREM 1.4.6 



If A and B are invertible matrices of the same size, then AB is invertible and

$$(AB)^{-1} = B^{-1}A^{-1}$$



Proof If we can show that $(AB)(B^{-1}A^{-1}) = (B^{-1}A^{-1})(AB) = I$, then we will have simultaneously shown that the matrix AB
is invertible and that $(AB)^{-1} = B^{-1}A^{-1}$. But $(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I$. A similar argument
shows that $(B^{-1}A^{-1})(AB) = I$.



Although we will not prove it, this result can be extended to include three or more factors; that is, 



A product of any number of invertible matrices is invertible, and the inverse of the product is the product of the inverses in 
the reverse order. 



EXAMPLE 7 Inverse of a Product 



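Consider the matrices

$$A = \begin{bmatrix} 1 & 2 \\ 1 & 3 \end{bmatrix}, \quad B = \begin{bmatrix} 3 & 2 \\ 2 & 2 \end{bmatrix}, \quad AB = \begin{bmatrix} 7 & 6 \\ 9 & 8 \end{bmatrix}$$

Applying the formula in Theorem 1.4.5, we obtain

$$A^{-1} = \begin{bmatrix} 3 & -2 \\ -1 & 1 \end{bmatrix}, \quad B^{-1} = \begin{bmatrix} 1 & -1 \\ -1 & \frac{3}{2} \end{bmatrix}, \quad (AB)^{-1} = \begin{bmatrix} 4 & -3 \\ -\frac{9}{2} & \frac{7}{2} \end{bmatrix}$$

Also,

$$B^{-1}A^{-1} = \begin{bmatrix} 1 & -1 \\ -1 & \frac{3}{2} \end{bmatrix} \begin{bmatrix} 3 & -2 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix} 4 & -3 \\ -\frac{9}{2} & \frac{7}{2} \end{bmatrix}$$

Therefore, $(AB)^{-1} = B^{-1}A^{-1}$, as guaranteed by Theorem 1.4.6.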
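
The computation in Example 7 can be checked mechanically. The sketch below is illustrative only: `matmul` and `inverse_2x2` are helper names we introduce here (they are not from the text), and the matrices are those of Example 7.

```python
from fractions import Fraction


def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix by the formula of Theorem 1.4.5."""
    (a, b), (c, d) = m
    det = Fraction(a) * d - Fraction(b) * c
    if det == 0:
        raise ValueError("ad - bc = 0")
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[1, 2], [1, 3]]
B = [[3, 2], [2, 2]]

lhs = inverse_2x2(matmul(A, B))               # (AB)^{-1}
rhs = matmul(inverse_2x2(B), inverse_2x2(A))  # B^{-1} A^{-1}, reverse order
print(lhs == rhs)
```

Swapping the factors on the right (computing $A^{-1}B^{-1}$ instead) gives a different matrix for this pair, which is why the order reversal in Theorem 1.4.6 matters.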



Powers of a Matrix 



Next, we shall define powers of a square matrix and discuss their properties. 



DEFINITION 




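If A is a square matrix, then we define the nonnegative integer powers of A to be

$$A^0 = I, \qquad A^n = \underbrace{AA \cdots A}_{n \text{ factors}} \quad (n > 0)$$

Moreover, if A is invertible, then we define the negative integer powers to be

$$A^{-n} = (A^{-1})^n = \underbrace{A^{-1}A^{-1} \cdots A^{-1}}_{n \text{ factors}}$$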



Because this definition parallels that for real numbers, the usual laws of exponents hold. (We omit the details.) 



THEOREM 1.4.7 



Laws of Exponents 

If A is a square matrix and r and s are integers, then

$$A^rA^s = A^{r+s}, \qquad (A^r)^s = A^{rs}$$



The next theorem provides some useful properties of negative exponents.

THEOREM 1.4.8



Laws of Exponents

If A is an invertible matrix, then:

(a) $A^{-1}$ is invertible and $(A^{-1})^{-1} = A$.

(b) $A^n$ is invertible and $(A^n)^{-1} = (A^{-1})^n$ for $n = 0, 1, 2, \ldots$

(c) For any nonzero scalar k, the matrix kA is invertible and $(kA)^{-1} = \frac{1}{k}A^{-1}$.



Proof

(a) Since $AA^{-1} = A^{-1}A = I$, the matrix $A^{-1}$ is invertible and $(A^{-1})^{-1} = A$.

(b) This part is left as an exercise.

(c) If k is any nonzero scalar, results (l) and (m) of Theorem 1.4.1 enable us to write

$$(kA)\left(\frac{1}{k}A^{-1}\right) = \frac{1}{k}(kA)A^{-1} = \left(\frac{1}{k}k\right)AA^{-1} = (1)I = I$$

Similarly, $\left(\frac{1}{k}A^{-1}\right)(kA) = I$, so that kA is invertible and $(kA)^{-1} = \frac{1}{k}A^{-1}$.



EXAMPLE 8 Powers of a Matrix 



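Let A and $A^{-1}$ be as in Example 7; that is,

$$A = \begin{bmatrix} 1 & 2 \\ 1 & 3 \end{bmatrix} \quad \text{and} \quad A^{-1} = \begin{bmatrix} 3 & -2 \\ -1 & 1 \end{bmatrix}$$

Then

$$A^3 = \begin{bmatrix} 1 & 2 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 1 & 3 \end{bmatrix} = \begin{bmatrix} 11 & 30 \\ 15 & 41 \end{bmatrix}$$

$$A^{-3} = (A^{-1})^3 = \begin{bmatrix} 3 & -2 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 3 & -2 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 3 & -2 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix} 41 & -30 \\ -15 & 11 \end{bmatrix}$$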
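
The definition of integer powers can be followed literally in code: multiply n copies of A for a positive exponent, and n copies of $A^{-1}$ for a negative one. This is a sketch under stated assumptions: the helper names `matmul`, `identity`, and `mat_pow` are ours, and the caller is assumed to supply a correct inverse.

```python
def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def identity(n):
    """The n x n identity matrix."""
    return [[int(i == j) for j in range(n)] for i in range(n)]


def mat_pow(a, a_inv, n):
    """A^n for any integer n, following the definition in the text.

    a_inv must be the inverse of a; it is only used when n < 0.
    """
    base = a if n >= 0 else a_inv
    result = identity(len(a))          # A^0 = I
    for _ in range(abs(n)):            # |n| factors
        result = matmul(result, base)
    return result


A = [[1, 2], [1, 3]]
A_inv = [[3, -2], [-1, 1]]
print(mat_pow(A, A_inv, 3))    # A^3 from Example 8
print(mat_pow(A, A_inv, -3))   # A^{-3} from Example 8
```

Multiplying the two printed matrices together gives the identity, which is one instance of the law $A^rA^s = A^{r+s}$ with $r = 3$, $s = -3$.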



Polynomial Expressions Involving Matrices

If A is a square matrix, say $m \times m$, and if

$$p(x) = a_0 + a_1x + \cdots + a_nx^n \qquad (1)$$

is any polynomial, then we define

$$p(A) = a_0I + a_1A + \cdots + a_nA^n$$

where I is the $m \times m$ identity matrix. In words, $p(A)$ is the $m \times m$ matrix that results when A is substituted for x in (1) and $a_0$ is
replaced by $a_0I$.



EXAMPLE 9 Matrix Polynomial 



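If

$$p(x) = 2x^2 - 3x + 4 \quad \text{and} \quad A = \begin{bmatrix} -1 & 2 \\ 0 & 3 \end{bmatrix}$$

then

$$p(A) = 2A^2 - 3A + 4I = 2\begin{bmatrix} -1 & 2 \\ 0 & 3 \end{bmatrix}^2 - 3\begin{bmatrix} -1 & 2 \\ 0 & 3 \end{bmatrix} + 4\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

$$= \begin{bmatrix} 2 & 8 \\ 0 & 18 \end{bmatrix} - \begin{bmatrix} -3 & 6 \\ 0 & 9 \end{bmatrix} + \begin{bmatrix} 4 & 0 \\ 0 & 4 \end{bmatrix} = \begin{bmatrix} 9 & 2 \\ 0 & 13 \end{bmatrix}$$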
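
A matrix polynomial can be evaluated without forming each power of A separately. The sketch below uses Horner's scheme, which is our substitution for the term-by-term evaluation shown in Example 9 and is not described in the text; the name `poly_at_matrix` and the coefficient ordering (constant term first) are assumptions of this illustration.

```python
def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def poly_at_matrix(coeffs, a):
    """Evaluate p(A) = a0*I + a1*A + ... + an*A^n, with coeffs = [a0, a1, ..., an].

    Horner's scheme: start from the zero matrix and repeatedly replace
    the running result R by R*A + c*I, taking c from an down to a0.
    """
    n = len(a)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, a)
        for i in range(n):
            result[i][i] += c      # adding c*I only touches the diagonal
    return result


# p(x) = 2x^2 - 3x + 4 and the matrix A of Example 9.
print(poly_at_matrix([4, -3, 2], [[-1, 2], [0, 3]]))
```

Note that the constant term contributes $a_0I$, not the scalar $a_0$, exactly as the definition requires.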



Properties of the Transpose 

The next theorem lists the main properties of the transpose operation. 



THEOREM 1.4.9 



Properties of the Transpose 

If the sizes of the matrices are such that the stated operations can be performed, then

(a) $(A^T)^T = A$

(b) $(A + B)^T = A^T + B^T$ and $(A - B)^T = A^T - B^T$

(c) $(kA)^T = kA^T$, where k is any scalar

(d) $(AB)^T = B^TA^T$







If we keep in mind that transposing a matrix interchanges its rows and columns, parts (a), (b), and (c) should be self-evident. 
For example, part (a) states that interchanging rows and columns twice leaves a matrix unchanged; part (b) asserts that adding 
and then interchanging rows and columns yields the same result as first interchanging rows and columns and then adding; and 
part (c) asserts that multiplying by a scalar and then interchanging rows and columns yields the same result as first 
interchanging rows and columns and then multiplying by the scalar. Part (d) is not so obvious, so we give its proof. 



Proof (d) Let $A = [a_{ij}]_{m \times r}$ and $B = [b_{ij}]_{r \times n}$ so that the products AB and $B^TA^T$ can both be formed. We leave it for the
reader to check that $(AB)^T$ and $B^TA^T$ have the same size, namely $n \times m$. Thus it only remains to show that corresponding
entries of $(AB)^T$ and $B^TA^T$ are the same; that is,

$$\left((AB)^T\right)_{ij} = (B^TA^T)_{ij} \qquad (2)$$

Applying Formula 11 of Section 1.3 to the left side of this equation and using the definition of matrix
multiplication, we obtain

$$\left((AB)^T\right)_{ij} = (AB)_{ji} = a_{j1}b_{1i} + a_{j2}b_{2i} + \cdots + a_{jr}b_{ri} \qquad (3)$$

To evaluate the right side of (2), it will be convenient to let $a'_{ij}$ and $b'_{ij}$ denote the ijth entries of $A^T$ and
$B^T$, respectively, so

$$a'_{ij} = a_{ji} \quad \text{and} \quad b'_{ij} = b_{ji}$$

From these relationships and the definition of matrix multiplication, we obtain

$$(B^TA^T)_{ij} = b'_{i1}a'_{1j} + b'_{i2}a'_{2j} + \cdots + b'_{ir}a'_{rj}$$
$$= b_{1i}a_{j1} + b_{2i}a_{j2} + \cdots + b_{ri}a_{jr}$$
$$= a_{j1}b_{1i} + a_{j2}b_{2i} + \cdots + a_{jr}b_{ri}$$

This, together with (3), proves (2).

■ 

Although we shall not prove it, part (d) of this theorem can be extended to include three or more factors; that is, 
The transpose of a product of any number of matrices is equal to the product of their transposes in the reverse order. 



Remark Note the similarity between this result and the result following Theorem 1.4.6 about the inverse of a product of 
matrices. 



Invertibility of a Transpose 

The following theorem establishes a relationship between the inverse of an invertible matrix and the inverse of its transpose.

THEOREM 1.4.10



If A is an invertible matrix, then $A^T$ is also invertible and

$$(A^T)^{-1} = (A^{-1})^T \qquad (4)$$



Proof We can prove the invertibility of $A^T$ and obtain (4) by showing that

$$A^T(A^{-1})^T = (A^{-1})^TA^T = I$$

But from part (d) of Theorem 1.4.9 and the fact that $I^T = I$, we have

$$A^T(A^{-1})^T = (A^{-1}A)^T = I^T = I$$

$$(A^{-1})^TA^T = (AA^{-1})^T = I^T = I$$

which completes the proof.



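EXAMPLE 10 Verifying Theorem 1.4.10



Consider the matrices

$$A = \begin{bmatrix} -5 & -3 \\ 2 & 1 \end{bmatrix}, \qquad A^T = \begin{bmatrix} -5 & 2 \\ -3 & 1 \end{bmatrix}$$

Applying Theorem 1.4.5 yields

$$A^{-1} = \begin{bmatrix} 1 & 3 \\ -2 & -5 \end{bmatrix}, \qquad (A^{-1})^T = \begin{bmatrix} 1 & -2 \\ 3 & -5 \end{bmatrix}, \qquad (A^T)^{-1} = \begin{bmatrix} 1 & -2 \\ 3 & -5 \end{bmatrix}$$

As guaranteed by Theorem 1.4.10, these matrices satisfy (4).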
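
Both transpose rules used in this part of the section, $(AB)^T = B^TA^T$ and $(A^T)^{-1} = (A^{-1})^T$, are easy to spot-check numerically. This is only a sketch: `transpose` and `matmul` are helper names of our own, the matrix `B` is an arbitrary choice made for illustration, and `A`, `A_inv` are taken from the example on verifying Theorem 1.4.10.

```python
def transpose(m):
    """Interchange the rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


A = [[-5, -3], [2, 1]]
A_inv = [[1, 3], [-2, -5]]
B = [[3, 2], [2, 2]]        # arbitrary second factor (our choice)
I2 = [[1, 0], [0, 1]]

# (AB)^T = B^T A^T: note the reversed order on the right.
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))

# (A^{-1})^T is a two-sided inverse of A^T, which is formula (4).
print(matmul(transpose(A), transpose(A_inv)) == I2)
print(matmul(transpose(A_inv), transpose(A)) == I2)
```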



Exercise Set 1 .4 






1. Let

$$A = \begin{bmatrix} 2 & -1 & 3 \\ 0 & 4 & 5 \\ -2 & 1 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 8 & -3 & -5 \\ 0 & 1 & 2 \\ 4 & -7 & 6 \end{bmatrix}, \quad C = \begin{bmatrix} 0 & -2 & 3 \\ 1 & 7 & 4 \\ 3 & 5 & 9 \end{bmatrix}, \quad a = 4, \quad b = -7$$

Show that

(a) $A + (B + C) = (A + B) + C$

(b) $(AB)C = A(BC)$

(c) $(a + b)C = aC + bC$

(d) $a(B - C) = aB - aC$



2. Using the matrices and scalars in Exercise 1, verify that

(a) $a(BC) = (aB)C = B(aC)$

(b) $A(B - C) = AB - AC$

(c) $(B + C)A = BA + CA$

(d) $a(bC) = (ab)C$



3. Using the matrices and scalars in Exercise 1, verify that

(a) $(A^T)^T = A$

(b) $(A + B)^T = A^T + B^T$

(c) $(aC)^T = aC^T$

(d) $(AB)^T = B^TA^T$



4. Use Theorem 1.4.5 to compute the inverses of the following matrices.

(a) $A = \begin{bmatrix} 3 & 1 \\ 5 & 2 \end{bmatrix}$

(b) $B = \begin{bmatrix} 2 & -3 \\ 4 & 4 \end{bmatrix}$

(c) $C = \begin{bmatrix} 6 & 4 \\ -2 & -1 \end{bmatrix}$

(d) $D = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}$



5. Use the matrices A and B in Exercise 4 to verify that

(a) $(A^{-1})^{-1} = A$

(b) $(B^T)^{-1} = (B^{-1})^T$



6. Use the matrices A, B, and C in Exercise 4 to verify that

(a) $(AB)^{-1} = B^{-1}A^{-1}$

(b) $(ABC)^{-1} = C^{-1}B^{-1}A^{-1}$



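7. In each part, use the given information to find A.

(a) $A^{-1} = \begin{bmatrix} 2 & -1 \\ 3 & 5 \end{bmatrix}$

(b) $(7A)^{-1} = \begin{bmatrix} -3 & 7 \\ 1 & -2 \end{bmatrix}$

(c) $(5A^T)^{-1} = \begin{bmatrix} -3 & -1 \\ 5 & 2 \end{bmatrix}$

(d) $(I + 2A)^{-1} = \begin{bmatrix} -1 & 2 \\ 4 & 5 \end{bmatrix}$



8. Let A be the matrix

$$\begin{bmatrix} 2 & 0 \\ 4 & 1 \end{bmatrix}$$

Compute $A^3$, $A^{-3}$, and $A^2 - 2A + I$.



9. Let A be the matrix

$$\begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix}$$

In each part, find $p(A)$.

(a) $p(x) = x - 2$

(b) $p(x) = 2x^2 - x + 1$

(c) $p(x) = x^3 - 2x + 4$



10. Let $p_1(x) = x^2 - 9$, $p_2(x) = x + 3$, and $p_3(x) = x - 3$.

(a) Show that $p_1(A) = p_2(A)p_3(A)$ for the matrix A in Exercise 9.

(b) Show that $p_1(A) = p_2(A)p_3(A)$ for any square matrix A.



11. Find the inverse of

$$\begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$$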


12. 



Find the inverse of 



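13. Consider the matrix

$$A = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{bmatrix}$$

where $a_{11}a_{22} \cdots a_{nn} \neq 0$. Show that A is invertible and find its inverse.



14. Show that if a square matrix A satisfies $A^2 - 3A + I = 0$, then $A^{-1} = 3I - A$.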



15. 



(a) Show that a matrix with a row of zeros cannot have an inverse. 



(b) Show that a matrix with a column of zeros cannot have an inverse. 



16. 



Is the sum of two invertible matrices necessarily invertible? 



17. 



Let A and B be square matrices such that $AB = 0$. Show that if A is invertible, then $B = 0$.



18. 



Let A, B, and 0 be 2 x 2 matrices. Assuming that A is invertible, find a matrix C such that

$$\begin{bmatrix} A^{-1} & 0 \\ C & A^{-1} \end{bmatrix}$$

is the inverse of the partitioned matrix

$$\begin{bmatrix} A & 0 \\ B & A \end{bmatrix}$$

(See Exercise 15 of the preceding section.)



19. 



Use the result in Exercise 18 to find the inverses of the following matrices. 



(a) 



1 1 

1 1 

1 1 

1 1 






o" 








1 


1 


-1 


1 



(b) 



1 


1 


o" 





1 











1 1 








1 



20. 



(a) Find a nonzero 3 x 3 matrix A such that $A^T = A$.

(b) Find a nonzero 3 x 3 matrix A such that $A^T = -A$.



21. A square matrix A is called symmetric if $A^T = A$ and skew-symmetric if $A^T = -A$. Show that if B is a square matrix, then

(a) $BB^T$ and $B + B^T$ are symmetric

(b) $B - B^T$ is skew-symmetric



22. If A is a square matrix and n is a positive integer, is it true that $(A^n)^T = (A^T)^n$? Justify your answer.



23. Let A be the matrix

$$\begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$$

Determine whether A is invertible, and if so, find its inverse.

Hint Solve $AX = I$ by equating corresponding entries on the two sides.



Prove: 
24. 



25. 



(a) part (b) of Theorem 1.4.1 

(b) part (/) of Theorem 1.4.1 

(c) part (m) of Theorem 1.4.1 

Apply parts (d) and (m) of Theorem 1.4.1 to the matrices A, B, and $(-1)C$ to derive the result in part (f).



Prove Theorem 1.4.2. 
26. 



Consider the laws of exponents $A^rA^s = A^{r+s}$ and $(A^r)^s = A^{rs}$.
27. 



(a) Show that if A is any square matrix, then these laws are valid for all nonnegative integer values of r and s. 

(b) Show that if A is invertible, then these laws hold for all negative integer values of r and s. 

Show that if A is invertible and k is any nonzero scalar, then (kA) n = k n A n for all integer values of n. 
28. 



29. 

(a) Show that if A is invertible and $AB = AC$, then $B = C$.



(b) Explain why part (a) and Example 3 do not contradict one another. 

Prove part (c) of Theorem 1.4.1. 
30. 

Hint Assume that A is $m \times n$, B is $n \times p$, and C is $p \times q$. The ijth entry on the left side is
$l_{ij} = a_{i1}[BC]_{1j} + a_{i2}[BC]_{2j} + \cdots + a_{in}[BC]_{nj}$ and the ijth entry on the right side is
$r_{ij} = [AB]_{i1}c_{1j} + [AB]_{i2}c_{2j} + \cdots + [AB]_{ip}c_{pj}$. Verify that $l_{ij} = r_{ij}$.



Discussion and Discovery

31. Let A and B be square matrices with the same size.

(a) Give an example in which $(A + B)^2 \neq A^2 + 2AB + B^2$.

(b) Fill in the blank to create a matrix identity that is valid for all choices of A and B.
$(A + B)^2 = A^2 + B^2 +$ ______.



32. Let A and B be square matrices with the same size.

(a) Give an example in which $(A + B)(A - B) \neq A^2 - B^2$.

(b) Fill in the blank to create a matrix identity that is valid for all choices of A and B.
$(A + B)(A - B) =$ ______.



33. In the real number system the equation $a^2 = 1$ has exactly two solutions. Find at least eight
different 3 x 3 matrices that satisfy the equation $A^2 = I_3$.

Hint Look for solutions in which all entries off the main diagonal are zero.



34. A statement of the form "If p, then q" is logically equivalent to the statement "If not q, then not
p." (The second statement is called the logical contrapositive of the first.) For example, the
logical contrapositive of the statement "If it is raining, then the ground is wet" is "If the ground
is not wet, then it is not raining."

(a) Find the logical contrapositive of the following statement: If $A^T$ is singular, then A is
singular.

(b) Is the statement true or false? Explain.



35. Let A and B be $n \times n$ matrices. Indicate whether the statement is always true or sometimes false.
Justify each answer.

(a) $(AB)^2 = A^2B^2$

(b) $(A - B)^2 = (B - A)^2$

(c) $(AB^{-1})(BA^{-1}) = I_n$

(d) $AB \neq BA$



36. Assuming that all matrices are $n \times n$ and invertible, solve for D.

$$ABC^TDBA^TC = AB^T$$
1.5

ELEMENTARY MATRICES
AND A METHOD FOR
FINDING $A^{-1}$


In this section we shall develop an algorithm for finding the inverse of an 
invertible matrix. We shall also discuss some of the basic properties of 
invertible matrices. 



We begin with the definition of a special type of matrix that can be used to carry out an elementary row operation by matrix 
multiplication. 



DEFINITION 



An $n \times n$ matrix is called an elementary matrix if it can be obtained from the $n \times n$ identity matrix $I_n$ by performing a single
elementary row operation.



EXAMPLE 1 Elementary Matrices and Row Operations 



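Listed below are four elementary matrices and the operations that produce them.

$$\begin{bmatrix} 1 & 0 \\ 0 & -3 \end{bmatrix}$$

Multiply the second row of $I_2$ by $-3$.

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}$$

Interchange the second and fourth rows of $I_4$.

$$\begin{bmatrix} 1 & 0 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Add 3 times the third row of $I_3$ to the first row.

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Multiply the first row of $I_3$ by 1.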



When a matrix A is multiplied on the left by an elementary matrix E, the effect is to perform an elementary row operation on A. 
This is the content of the following theorem, the proof of which is left for the exercises. 



THEOREM 1.5.1 



Row Operations by Matrix Multiplication 

If the elementary matrix E results from performing a certain row operation on $I_m$ and if A is an $m \times n$ matrix, then the product
EA is the matrix that results when this same row operation is performed on A.



EXAMPLE 2 Using Elementary Matrices 



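Consider the matrix

$$A = \begin{bmatrix} 1 & 0 & 2 & 3 \\ 2 & -1 & 3 & 6 \\ 1 & 4 & 4 & 0 \end{bmatrix}$$

and consider the elementary matrix

$$E = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 3 & 0 & 1 \end{bmatrix}$$

which results from adding 3 times the first row of $I_3$ to the third row. The product EA is

$$EA = \begin{bmatrix} 1 & 0 & 2 & 3 \\ 2 & -1 & 3 & 6 \\ 4 & 4 & 10 & 9 \end{bmatrix}$$

which is precisely the same matrix that results when we add 3 times the first row of A to the third row.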
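
Theorem 1.5.1 can be observed directly: build E by applying a row operation to the identity, then compare EA with the result of applying the same operation to A. The sketch below is illustrative; the helper names (`identity`, `matmul`, `add_multiple`) and the 0-based row indices are our own conventions, and the matrix is the one from Example 2.

```python
def identity(n):
    """The n x n identity matrix."""
    return [[int(i == j) for j in range(n)] for i in range(n)]


def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def add_multiple(m, c, src, dst):
    """Return a copy of m with c times row src added to row dst (0-based rows)."""
    out = [row[:] for row in m]
    out[dst] = [x + c * y for x, y in zip(m[dst], m[src])]
    return out


A = [[1, 0, 2, 3],
     [2, -1, 3, 6],
     [1, 4, 4, 0]]

# Elementary matrix: add 3 times the first row of I_3 to the third row.
E = add_multiple(identity(3), 3, 0, 2)

# Left multiplication by E performs the same operation on A.
print(matmul(E, A) == add_multiple(A, 3, 0, 2))
```

As the Remark that follows points out, this is of theoretical interest; in practice one applies the row operation directly instead of forming E and multiplying.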



Remark 

Theorem 1.5.1 is primarily of theoretical interest and will be used for developing some results about matrices and systems of 
linear equations. Computationally, it is preferable to perform row operations directly rather than multiplying on the left by an 
elementary matrix. 

If an elementary row operation is applied to an identity matrix I to produce an elementary matrix E, then there is a second row
operation that, when applied to E, produces I back again. For example, if E is obtained by multiplying the ith row of I by a
nonzero constant c, then I can be recovered if the ith row of E is multiplied by 1/c. The various possibilities are listed in Table 1.
The operations on the right side of this table are called the inverse operations of the corresponding operations on the left.

Table 1

Row Operation on I That Produces E      Row Operation on E That Reproduces I

Multiply row i by c ≠ 0                 Multiply row i by 1/c

Interchange rows i and j                Interchange rows i and j

Add c times row i to row j              Add -c times row i to row j



EXAMPLE 3 Row Operations and Inverse Row Operations 



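In each of the following, an elementary row operation is applied to the 2 x 2 identity matrix to obtain an elementary matrix E,
then E is restored to the identity matrix by applying the inverse row operation.

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 \\ 0 & 7 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

First step: multiply the second row by 7. Second step: multiply the second row by 1/7.

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \longrightarrow \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

First step: interchange the first and second rows. Second step: interchange the first and second rows.

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 5 \\ 0 & 1 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

First step: add 5 times the second row to the first. Second step: add -5 times the second row to the first.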



The next theorem gives an important property of elementary matrices. 



THEOREM 1.5.2 



Every elementary matrix is invertible, and the inverse is also an elementary matrix. 



Proof If E is an elementary matrix, then E results from performing some row operation on I. Let $E_0$ be the matrix that results
when the inverse of this operation is performed on I. Applying Theorem 1.5.1 and using the fact that inverse row operations
cancel the effect of each other, it follows that

$$E_0E = I \quad \text{and} \quad EE_0 = I$$

Thus, the elementary matrix $E_0$ is the inverse of E.

■ 

The next theorem establishes some fundamental relationships among invertibility, homogeneous linear systems, reduced 
row-echelon forms, and elementary matrices. These results are extremely important and will be used many times in later sections. 



THEOREM 1.5.3 



Equivalent Statements 

If A is an $n \times n$ matrix, then the following statements are equivalent, that is, all true or all false.

(a) A is invertible.

(b) $A\mathbf{x} = \mathbf{0}$ has only the trivial solution.

(c) The reduced row-echelon form of A is $I_n$.

(d) A is expressible as a product of elementary matrices.



Proof We shall prove the equivalence by establishing the chain of implications: (a) ⇒ (b) ⇒ (c) ⇒ (d) ⇒ (a).

(a) ⇒ (b) Assume A is invertible and let $\mathbf{x}_0$ be any solution of $A\mathbf{x} = \mathbf{0}$; thus $A\mathbf{x}_0 = \mathbf{0}$. Multiplying both sides of this equation by
the matrix $A^{-1}$ gives $A^{-1}(A\mathbf{x}_0) = A^{-1}\mathbf{0}$, or $(A^{-1}A)\mathbf{x}_0 = \mathbf{0}$, or $I\mathbf{x}_0 = \mathbf{0}$, or $\mathbf{x}_0 = \mathbf{0}$. Thus, $A\mathbf{x} = \mathbf{0}$ has only the trivial
solution.

(b) ⇒ (c) Let $A\mathbf{x} = \mathbf{0}$ be the matrix form of the system

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= 0 \\ &\;\;\vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= 0 \end{aligned} \qquad (1)$$

and assume that the system has only the trivial solution. If we solve by Gauss-Jordan
elimination, then the system of equations corresponding to the reduced row-echelon form of
the augmented matrix will be

$$\begin{aligned} x_1 &= 0 \\ x_2 &= 0 \\ &\;\;\vdots \\ x_n &= 0 \end{aligned} \qquad (2)$$

Thus the augmented matrix

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} & 0 \\ a_{21} & a_{22} & \cdots & a_{2n} & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} & 0 \end{bmatrix}$$

for (1) can be reduced to the augmented matrix

$$\begin{bmatrix} 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix}$$

for (2) by a sequence of elementary row operations. If we disregard the last column (of zeros) in each of these matrices,
we can conclude that the reduced row-echelon form of A is $I_n$.

(c) ⇒ (d) Assume that the reduced row-echelon form of A is $I_n$, so that A can be reduced to $I_n$ by a finite sequence of elementary
row operations. By Theorem 1.5.1, each of these operations can be accomplished by multiplying on the left by an
appropriate elementary matrix. Thus we can find elementary matrices $E_1, E_2, \ldots, E_k$ such that

$$E_k \cdots E_2E_1A = I_n \qquad (3)$$

By Theorem 1.5.2, $E_1, E_2, \ldots, E_k$ are invertible. Multiplying both sides of Equation (3) on the left
successively by $E_k^{-1}, \ldots, E_2^{-1}, E_1^{-1}$ we obtain

$$A = E_1^{-1}E_2^{-1} \cdots E_k^{-1}I_n = E_1^{-1}E_2^{-1} \cdots E_k^{-1} \qquad (4)$$

By Theorem 1.5.2, this equation expresses A as a product of elementary matrices.

(d) ⇒ (a) If A is a product of elementary matrices, then from Theorems 1.4.6 and 1.5.2, the matrix A is a
product of invertible matrices and hence is invertible.



Row Equivalence 

If a matrix B can be obtained from a matrix A by performing a finite sequence of elementary row operations, then obviously we
can get from B back to A by performing the inverses of these elementary row operations in reverse order. Matrices that can be
obtained from one another by a finite sequence of elementary row operations are said to be row equivalent. With this
terminology, it follows from parts (a) and (c) of Theorem 1.5.3 that an $n \times n$ matrix A is invertible if and only if it is row equivalent to
the $n \times n$ identity matrix.

A Method for Inverting Matrices

As our first application of Theorem 1.5.3, we shall establish a method for determining the inverse of an invertible matrix.
Multiplying (3) on the right by $A^{-1}$ yields

$$A^{-1} = E_k \cdots E_2E_1I_n \qquad (5)$$

which tells us that $A^{-1}$ can be obtained by multiplying $I_n$ successively on the left by the elementary matrices $E_1, E_2, \ldots, E_k$.
Since each multiplication on the left by one of these elementary matrices performs a row operation, it follows, by comparing
Equations (3) and (5), that the sequence of row operations that reduces A to $I_n$ will reduce $I_n$ to $A^{-1}$. Thus we have the following
result:



To find the inverse of an invertible matrix A, we must find a sequence of elementary row operations that reduces A to the
identity and then perform this same sequence of operations on $I_n$ to obtain $A^{-1}$.



A simple method for carrying out this procedure is given in the following example. 



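EXAMPLE 4 Using Row Operations to Find $A^{-1}$



Find the inverse of

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 3 \\ 1 & 0 & 8 \end{bmatrix}$$



Solution



We want to reduce A to the identity matrix by row operations and simultaneously apply these operations to I to produce $A^{-1}$. To
accomplish this we shall adjoin the identity matrix to the right side of A, thereby producing a matrix of the form

$$[A \mid I]$$

Then we shall apply row operations to this matrix until the left side is reduced to I; these operations will convert the right side to
$A^{-1}$, so the final matrix will have the form

$$[I \mid A^{-1}]$$

The computations are as follows:

$$\left[\begin{array}{rrr|rrr} 1 & 2 & 3 & 1 & 0 & 0 \\ 2 & 5 & 3 & 0 & 1 & 0 \\ 1 & 0 & 8 & 0 & 0 & 1 \end{array}\right]$$

$$\left[\begin{array}{rrr|rrr} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & -3 & -2 & 1 & 0 \\ 0 & -2 & 5 & -1 & 0 & 1 \end{array}\right]$$

We added -2 times the first row to the second and -1 times the first row to the third.

$$\left[\begin{array}{rrr|rrr} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & -3 & -2 & 1 & 0 \\ 0 & 0 & -1 & -5 & 2 & 1 \end{array}\right]$$

We added 2 times the second row to the third.

$$\left[\begin{array}{rrr|rrr} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & -3 & -2 & 1 & 0 \\ 0 & 0 & 1 & 5 & -2 & -1 \end{array}\right]$$

We multiplied the third row by -1.

$$\left[\begin{array}{rrr|rrr} 1 & 2 & 0 & -14 & 6 & 3 \\ 0 & 1 & 0 & 13 & -5 & -3 \\ 0 & 0 & 1 & 5 & -2 & -1 \end{array}\right]$$

We added 3 times the third row to the second and -3 times the third row to the first.

$$\left[\begin{array}{rrr|rrr} 1 & 0 & 0 & -40 & 16 & 9 \\ 0 & 1 & 0 & 13 & -5 & -3 \\ 0 & 0 & 1 & 5 & -2 & -1 \end{array}\right]$$

We added -2 times the second row to the first.

Thus,

$$A^{-1} = \begin{bmatrix} -40 & 16 & 9 \\ 13 & -5 & -3 \\ 5 & -2 & -1 \end{bmatrix}$$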

Often it will not be known in advance whether a given matrix is invertible. If an $n \times n$ matrix A is not invertible, then it cannot be
reduced to $I_n$ by elementary row operations [part (c) of Theorem 1.5.3]. Stated another way, the reduced row-echelon form of A has
at least one row of zeros. Thus, if the procedure in the last example is attempted on a matrix that is not invertible, then at some
point in the computations a row of zeros will occur on the left side. It can then be concluded that the given matrix is not
invertible, and the computations can be stopped.
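
The inversion procedure, including the early exit for a singular matrix, can be sketched as follows. This is an illustration under stated assumptions and not the text's own algorithm listing: the function name `invert` is ours, exact fractions are used to avoid rounding, and the code clears each column completely as it goes (so the order of row operations differs from the hand computation in Example 4, although the final result is the same). Singularity is detected when no nonzero pivot can be found in a column, which corresponds to the left side being irreducible to the identity.

```python
from fractions import Fraction


def invert(a):
    """Invert a square matrix by row-reducing [A | I]; return None if A is singular."""
    n = len(a)
    # Adjoin the identity to the right of A, using exact rational entries.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Look at or below the diagonal for a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None                                   # cannot reach I on the left
        aug[col], aug[pivot] = aug[pivot], aug[col]       # interchange two rows
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]              # scale to get a leading 1
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                # Add -f times the pivot row to row r.
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]                       # right half is the inverse


print(invert([[1, 2, 3], [2, 5, 3], [1, 0, 8]]))    # matrix of Example 4
print(invert([[1, 6, 4], [2, 4, -1], [-1, 2, 5]]))  # a singular matrix
```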



EXAMPLE 5 Showing That a Matrix Is Not Invertible 



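Consider the matrix

$$A = \begin{bmatrix} 1 & 6 & 4 \\ 2 & 4 & -1 \\ -1 & 2 & 5 \end{bmatrix}$$

Applying the procedure of Example 4 yields

$$\left[\begin{array}{rrr|rrr} 1 & 6 & 4 & 1 & 0 & 0 \\ 2 & 4 & -1 & 0 & 1 & 0 \\ -1 & 2 & 5 & 0 & 0 & 1 \end{array}\right]$$

$$\left[\begin{array}{rrr|rrr} 1 & 6 & 4 & 1 & 0 & 0 \\ 0 & -8 & -9 & -2 & 1 & 0 \\ 0 & 8 & 9 & 1 & 0 & 1 \end{array}\right]$$

We added -2 times the first row to the second and added the first row to the third.

$$\left[\begin{array}{rrr|rrr} 1 & 6 & 4 & 1 & 0 & 0 \\ 0 & -8 & -9 & -2 & 1 & 0 \\ 0 & 0 & 0 & -1 & 1 & 1 \end{array}\right]$$

We added the second row to the third.

Since we have obtained a row of zeros on the left side, A is not invertible.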



EXAMPLE 6 A Consequence of Invertibility 



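In Example 4 we showed that

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 3 \\ 1 & 0 & 8 \end{bmatrix}$$

is an invertible matrix. From Theorem 1.5.3, it follows that the homogeneous system

$$\begin{aligned} x_1 + 2x_2 + 3x_3 &= 0 \\ 2x_1 + 5x_2 + 3x_3 &= 0 \\ x_1 \phantom{{}+ 2x_2} + 8x_3 &= 0 \end{aligned}$$

has only the trivial solution.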



Exercise Set 1 .5 






Which of the following are elementary matrices? 



(a) 



1 
-5 1 



(b) 



-5 1 
1 



(c) 



"1 0" 
j/3 



(d) 



"0 





f 





1 





1 









(e) 



"1 


1 


0" 








1 












(f) 



"1 





0" 





1 


9 








1 



(g) 



2 








2~ 





1 














1 














1 



Find a row operation that will restore the given elementary matrix to an identity matrix. 



(a) 



1 
3 1 



(b) 



"1 





0" 





1 











3 



(c) 












f 





1 














1 





1 












(d) 



1 

1 







1 





7 










1 








1 



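3. Consider the matrices

$$A = \begin{bmatrix} 3 & 4 & 1 \\ 2 & -7 & -1 \\ 8 & 1 & 5 \end{bmatrix}, \quad B = \begin{bmatrix} 8 & 1 & 5 \\ 2 & -7 & -1 \\ 3 & 4 & 1 \end{bmatrix}, \quad C = \begin{bmatrix} 3 & 4 & 1 \\ 2 & -7 & -1 \\ 2 & -7 & 3 \end{bmatrix}$$

Find elementary matrices $E_1$, $E_2$, $E_3$, and $E_4$ such that

(a) $E_1A = B$

(b) $E_2B = A$

(c) $E_3A = C$

(d) $E_4C = A$

4. In Exercise 3 is it possible to find an elementary matrix E such that $EB = C$? Justify your answer.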

If a 2 x 2 matrix is multiplied on the left by the given matrices, what elementary row operation is performed on that matrix? 



(a) 



1 

1 



(b) 



2 
-3 



(c) 



1 
-2 1 



In Exercises 6-8 use the method shown in Examples 4 and 5 to find the inverse of the given matrix if the
matrix is invertible, and check your answer by multiplication.



6. 



(a) 



1 4 

2 7 



(b) 



-3 6 
4 5 



(c) 



6 -4 
3 2 



(a) 



3 4-1 

1 3 

2 5-4 



(b) 



-1 3 

2 4 

-4 2 



-4 

1 

-9 



(c) 



"1 





f 





1 


1 


1 


1 






(d) 



2 6 6 
2 7 6 
2 7 7 



(e) 



1 1 
1 1 1 
1 



(a) 



1 


1 


5 


5 


1 


1 


5 


5 


1 
5 


4 
5 



_ 2 

5 

J_ 

10 

J_ 
10 



(b) 



i[2 3/2 


-4/2 |/2 


1 



(c) 



10 

13 

13 5 

13 5 7 



(d) 



-8 17 2 

4 I 



-1 13 4 



1 
3 

-9 


2 



(e) 






2 





1 





1 





-1 3 





2 


1 5 


-3 



Find the inverse of each of the following 4 x 4 matrices, where $k_1$, $k_2$, $k_3$, $k_4$, and k are all nonzero.



(a) 



*1 








" 





k 2 














*3 














£4 



(b) 












*f 








h 








£3 








£4 












(c) 



k 








0" 


1 


k 











1 


k 











1 


k 



10. Consider the matrix

$$A = \begin{bmatrix} 1 & 0 \\ -5 & 2 \end{bmatrix}$$

(a) Find elementary matrices $E_1$ and $E_2$ such that $E_2E_1A = I$.

(b) Write $A^{-1}$ as a product of two elementary matrices.

(c) Write A as a product of two elementary matrices.



11. 



In each part, perform the stated row operation on 



2 


-1 





4 


5 


-3 


1 


-4 


7 



by multiplying A on the left by a suitable elementary matrix. Check your answer in each case by performing the row 
operation directly on A. 



(a) Interchange the first and third rows. 



(b) Multiply the second row by ^. 



(c) Add twice the second row to the first row. 



Write the matrix 



12. 



3 -2 
3 -1 



as a product of elementary matrices. 

Note There is more than one correct solution. 



Let 



13. 



1 -2 
4 3 
1 



(a) Find elementary matrices $E_1$, $E_2$, and $E_3$ such that $E_3E_2E_1A = I_3$.



(b) Write A as a product of elementary matrices. 



14. 



Express the matrix 



A = 



17 8 

1 3 3 8 

2 -5 1 -8 



in the form $A = EFGR$, where E, F, and G are elementary matrices and R is in row-echelon form.



Show that if 



15. 



"1 


0" 


1 





a b 


c 



A = 



is an elementary matrix, then at least one entry in the third row must be a zero. 



Show that 



16. 



A = 






a 








0" 


b 





c 











d 





e 











/ 





s 











h 






17. 



is not invertible for any values of the entries. 

Prove that if A is an m x m matrix, there is an invertible matrix C such that CA is in reduced row-echelon form. 



18. 



Prove that if A is an invertible matrix and B is row equivalent to A, then B is also invertible. 



19. 



(a) Prove: If A and B are mxn matrices, then A and B are row equivalent if and only if A and B have the same reduced 
row-echelon form. 



(b) Show that A and B are row equivalent, and find a sequence of elementary row operations that produces B from A. 



,4 = 



"1 2 3" 




1 4 1 


B = 


2 1 9 





1 5 

2-2 

1 1 4 



Prove Theorem 1.5.1. 



20. 



Discussion and Discovery



Suppose that A is some unknown invertible matrix, but you know of a sequence of elementary row 
21. operations that produces the identity matrix when applied in succession to A. Explain how you can 
use the known information to find A. 



Indicate whether the statement is always true or sometimes false. Justify your answer with a 
22. logical argument or a counterexample. 



(a) Every square matrix can be expressed as a product of elementary matrices. 



(b) The product of two elementary matrices is an elementary matrix. 



(c) If A is invertible and a multiple of the first row of A is added to the second row, then the 
resulting matrix is invertible. 



(d) If A is invertible and $AB = 0$, then it must be true that $B = 0$.



Indicate whether the statement is always true or sometimes false. Justify your answer with a 
23. logical argument or a counterexample. 



(a) If A is a singular $n \times n$ matrix, then $A\mathbf{x} = \mathbf{0}$ has infinitely many solutions.



(b) If A is a singular n x n matrix, then the reduced row-echelon form of A has at least one row 
of zeros. 



(c) If $A^{-1}$ is expressible as a product of elementary matrices, then the homogeneous linear
system $A\mathbf{x} = \mathbf{0}$ has only the trivial solution.



(d) If A is a singular n x n matrix, and B results by interchanging two rows of A, then B may or 
may not be singular. 



24. 



Do you think that there is a 2 x 2 matrix A such that

$$A\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} b & d \\ a & c \end{bmatrix}$$

for all values of a, b, c, and d? Explain your reasoning.



Copyright © 2005 John Wiley & Sons, Inc. All rights reserved. 



1.6

FURTHER RESULTS ON
SYSTEMS OF
EQUATIONS AND
INVERTIBILITY



In this section we shall establish more results about systems of linear equations
and invertibility of matrices. Our work will lead to a new method for solving n
equations in n unknowns.



A Basic Theorem 

In Section 1.1 we made the statement (based on Figure 1.1.1) that every linear system has no solutions, or has one solution, or has 
infinitely many solutions. We are now in a position to prove this fundamental result. 

THEOREM 1.6.1 



Every system of linear equations has no solutions, or has exactly one solution, or has infinitely many solutions. 



Proof If $A\mathbf{x} = \mathbf{b}$ is a system of linear equations, exactly one of the following is true: (a) the system has no solutions, (b) the system
has exactly one solution, or (c) the system has more than one solution. The proof will be complete if we can show that the system
has infinitely many solutions in case (c).

Assume that $A\mathbf{x} = \mathbf{b}$ has more than one solution, and let $\mathbf{x}_0 = \mathbf{x}_1 - \mathbf{x}_2$, where $\mathbf{x}_1$ and $\mathbf{x}_2$ are any two distinct solutions. Because $\mathbf{x}_1$
and $\mathbf{x}_2$ are distinct, the matrix $\mathbf{x}_0$ is nonzero; moreover,

$$A\mathbf{x}_0 = A(\mathbf{x}_1 - \mathbf{x}_2) = A\mathbf{x}_1 - A\mathbf{x}_2 = \mathbf{b} - \mathbf{b} = \mathbf{0}$$

If we now let k be any scalar, then

$$A(\mathbf{x}_1 + k\mathbf{x}_0) = A\mathbf{x}_1 + A(k\mathbf{x}_0) = A\mathbf{x}_1 + k(A\mathbf{x}_0) = \mathbf{b} + k\mathbf{0} = \mathbf{b} + \mathbf{0} = \mathbf{b}$$

But this says that $\mathbf{x}_1 + k\mathbf{x}_0$ is a solution of $A\mathbf{x} = \mathbf{b}$. Since $\mathbf{x}_0$ is nonzero and there are infinitely many choices for k, the system
$A\mathbf{x} = \mathbf{b}$ has infinitely many solutions.

■ 
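The argument in the proof can be checked numerically. The sketch below is only an illustration: the singular matrix `A`, the right-hand side `b`, and the two solutions `x1` and `x2` are hypothetical values chosen for this example, not taken from the text. It forms $x_0 = x_1 - x_2$ and confirms that $x_1 + kx_0$ solves the system for several values of k.

```python
from fractions import Fraction

def matvec(A, x):
    """Return the product Ax for a matrix A (list of rows) and a vector x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

# Hypothetical singular system with more than one solution (not from the text).
A = [[1, 2], [2, 4]]
b = [3, 6]
x1 = [3, 0]   # one solution of Ax = b
x2 = [1, 1]   # a second, distinct solution

# x0 = x1 - x2 is nonzero and satisfies A x0 = 0.
x0 = [p - q for p, q in zip(x1, x2)]

# Every x1 + k*x0 is again a solution; here k runs over -3, ..., 3.
family = [[p + Fraction(k) * q for p, q in zip(x1, x0)] for k in range(-3, 4)]
all_solve = all(matvec(A, x) == b for x in family)
```

Because `x0` is nonzero, different values of k give different vectors, which is why the family of solutions is infinite.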

Solving Linear Systems by Matrix Inversion 

Thus far, we have studied two methods for solving linear systems: Gaussian elimination and Gauss-Jordan elimination. The 
following theorem provides a new method for solving certain linear systems. 

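THEOREM 1.6.2



If A is an invertible n × n matrix, then for each n × 1 matrix b, the system of equations Ax = b has exactly one solution,
namely, $x = A^{-1}b$.



Proof Since $A(A^{-1}b) = b$, it follows that $x = A^{-1}b$ is a solution of Ax = b. To show that this is the only solution, we will assume
that $x_0$ is an arbitrary solution and then show that $x_0$ must be the solution $A^{-1}b$.

If $x_0$ is any solution, then $Ax_0 = b$. Multiplying both sides by $A^{-1}$, we obtain $x_0 = A^{-1}b$.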



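EXAMPLE 1 Solution of a Linear System Using $A^{-1}$



Consider the system of linear equations

\[
\begin{aligned}
x_1 + 2x_2 + 3x_3 &= 5 \\
2x_1 + 5x_2 + 3x_3 &= 3 \\
x_1 \phantom{{}+ 2x_2} + 8x_3 &= 17
\end{aligned}
\]

In matrix form this system can be written as Ax = b, where

\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 3 \\ 1 & 0 & 8 \end{bmatrix}, \quad
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad
b = \begin{bmatrix} 5 \\ 3 \\ 17 \end{bmatrix}
\]

In Example 4 of the preceding section, we showed that A is invertible and

\[
A^{-1} = \begin{bmatrix} -40 & 16 & 9 \\ 13 & -5 & -3 \\ 5 & -2 & -1 \end{bmatrix}
\]

By Theorem 1.6.2, the solution of the system is

\[
x = A^{-1}b = \begin{bmatrix} -40 & 16 & 9 \\ 13 & -5 & -3 \\ 5 & -2 & -1 \end{bmatrix}
\begin{bmatrix} 5 \\ 3 \\ 17 \end{bmatrix}
= \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}
\]

or $x_1 = 1$, $x_2 = -1$, $x_3 = 2$.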
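The computation in Example 1 can be reproduced in a few lines. This is a minimal sketch using plain Python lists; the helper name `matvec` is an assumption introduced here for illustration. It multiplies the inverse from the example by b and then substitutes the result back into the original system.

```python
def matvec(M, v):
    """Return the matrix-vector product Mv (M is a list of rows)."""
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]

# Coefficient matrix, its inverse, and right-hand side from Example 1.
A = [[1, 2, 3], [2, 5, 3], [1, 0, 8]]
A_inv = [[-40, 16, 9], [13, -5, -3], [5, -2, -1]]
b = [5, 3, 17]

x = matvec(A_inv, b)               # x = A^{-1} b
residual_ok = matvec(A, x) == b    # substitute back: does Ax equal b?
```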



Remark 

Note that the method of Example 1 applies only when the system has as many equations as unknowns and the coefficient matrix is 
invertible. This method is less efficient, computationally, than Gaussian elimination, but it is important in the analysis of equations 
involving matrices. 

Linear Systems with a Common Coefficient Matrix 

Frequently, one is concerned with solving a sequence of systems 

\[ Ax = b_1, \quad Ax = b_2, \quad Ax = b_3, \quad \ldots, \quad Ax = b_k \]

each of which has the same square coefficient matrix A. If A is invertible, then the solutions

\[ x_1 = A^{-1}b_1, \quad x_2 = A^{-1}b_2, \quad x_3 = A^{-1}b_3, \quad \ldots, \quad x_k = A^{-1}b_k \]

can be obtained with one matrix inversion and k matrix multiplications. Once again, however, a more efficient method is to form the
matrix

\[ [\, A \mid b_1 \mid b_2 \mid \cdots \mid b_k \,] \tag{1} \]

in which the coefficient matrix A is "augmented" by all k of the matrices $b_1, b_2, \ldots, b_k$, and then reduce (1) to reduced row-echelon
form by Gauss-Jordan elimination. In this way we can solve all k systems at once. This method has the added advantage that it
applies even when A is not invertible.



EXAMPLE 2 Solving Two Linear Systems at Once 



Solve the systems 

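(a)
\[
\begin{aligned}
x_1 + 2x_2 + 3x_3 &= 4 \\
2x_1 + 5x_2 + 3x_3 &= 5 \\
x_1 \phantom{{}+ 2x_2} + 8x_3 &= 9
\end{aligned}
\]

(b)
\[
\begin{aligned}
x_1 + 2x_2 + 3x_3 &= 1 \\
2x_1 + 5x_2 + 3x_3 &= 6 \\
x_1 \phantom{{}+ 2x_2} + 8x_3 &= -6
\end{aligned}
\]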



Solution 

The two systems have the same coefficient matrix. If we augment this coefficient matrix with the columns of constants on the right 
sides of these systems, we obtain 



\[
\left[\begin{array}{ccc|c|c}
1 & 2 & 3 & 4 & 1 \\
2 & 5 & 3 & 5 & 6 \\
1 & 0 & 8 & 9 & -6
\end{array}\right]
\]



Reducing this matrix to reduced row-echelon form yields (verify) 



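\[
\left[\begin{array}{ccc|c|c}
1 & 0 & 0 & 1 & 2 \\
0 & 1 & 0 & 0 & 1 \\
0 & 0 & 1 & 1 & -1
\end{array}\right]
\]

It follows from the last two columns that the solution of system (a) is $x_1 = 1$, $x_2 = 0$, $x_3 = 1$ and the solution of system (b) is
$x_1 = 2$, $x_2 = 1$, $x_3 = -1$.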
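The reduction in Example 2 can be sketched in code. The function `rref` below is an illustrative Gauss-Jordan routine written for this example (it is not taken from the text); it uses exact rational arithmetic so that no rounding occurs. Applied to the augmented matrix of Example 2, it returns both solutions in the last two columns.

```python
from fractions import Fraction

def rref(M):
    """Reduce a matrix (list of rows) to reduced row-echelon form."""
    M = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(M), len(M[0])
    pivot_row = 0
    for col in range(cols):
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]
        # Scale the pivot row so that the leading entry is 1.
        lead = M[pivot_row][col]
        M[pivot_row] = [v / lead for v in M[pivot_row]]
        # Clear every other entry in the pivot column.
        for r in range(rows):
            if r != pivot_row and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return M

# Augmented matrix [A | b_1 | b_2] from Example 2.
augmented = [[1, 2, 3, 4, 1],
             [2, 5, 3, 5, 6],
             [1, 0, 8, 9, -6]]
R = rref(augmented)
solution_a = [row[3] for row in R]   # solution of system (a)
solution_b = [row[4] for row in R]   # solution of system (b)
```

Reading the solutions directly from the last columns is valid here because the left 3 × 3 block reduces to the identity; for a singular coefficient matrix the reduced form would have to be interpreted row by row.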

Properties of Invertible Matrices 

Up to now, to show that an n x n matrix A is invertible, it has been necessary to find an n x n matrix B such that 

AB = I and BA = I

The next theorem shows that if we produce an n x n matrix B satisfying either condition, then the other condition holds 
automatically. 



THEOREM 1.6.3 



Let A be a square matrix.

(a) If B is a square matrix satisfying BA = I, then $B = A^{-1}$.

(b) If B is a square matrix satisfying AB = I, then $B = A^{-1}$.


We shall prove part (a) and leave part (b) as an exercise. 



Proof (a) Assume that BA = I. If we can show that A is invertible, the proof can be completed by multiplying BA = I on both sides
by $A^{-1}$ to obtain

\[ BAA^{-1} = IA^{-1} \quad\text{or}\quad BI = IA^{-1} \quad\text{or}\quad B = A^{-1} \]

To show that A is invertible, it suffices to show that the system Ax = 0 has only the trivial solution (see
Theorem 1.5.3). Let $x_0$ be any solution of this system. If we multiply both sides of $Ax_0 = 0$ on the left by B, we
obtain $BAx_0 = B0$ or $Ix_0 = 0$ or $x_0 = 0$. Thus, the system of equations Ax = 0 has only the trivial solution.

■ 

We are now in a position to add two more statements that are equivalent to the four given in Theorem 1.5.3.
THEOREM 1.6.4 



Equivalent Statements 

If A is an n × n matrix, then the following are equivalent.

(a) A is invertible.

(b) Ax = 0 has only the trivial solution.

(c) The reduced row-echelon form of A is $I_n$.

(d) A is expressible as a product of elementary matrices.

(e) Ax = b is consistent for every n × 1 matrix b.

(f) Ax = b has exactly one solution for every n × 1 matrix b.



Proof Since we proved in Theorem 1.5.3 that (a), (b), (c), and (d) are equivalent, it will be sufficient to prove that
(a) ⇒ (f) ⇒ (e) ⇒ (a).

(a) ⇒ (f) This was already proved in Theorem 1.6.2.

(f) ⇒ (e) This is self-evident: If Ax = b has exactly one solution for every n × 1 matrix b, then Ax = b is consistent for every n × 1
matrix b.

(e) ⇒ (a) If the system Ax = b is consistent for every n × 1 matrix b, then in particular, the systems

\[
Ax = \begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad
Ax = \begin{bmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad
Ax = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}
\]

are consistent. Let $x_1, x_2, \ldots, x_n$ be solutions of the respective systems, and let us form an n × n
matrix C having these solutions as columns. Thus C has the form

\[ C = [\, x_1 \mid x_2 \mid \cdots \mid x_n \,] \]

As discussed in Section 1.3, the successive columns of the product AC will be

\[ Ax_1, \; Ax_2, \; \ldots, \; Ax_n \]

Thus

\[
AC = [\, Ax_1 \mid Ax_2 \mid \cdots \mid Ax_n \,] =
\begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix} = I
\]

By part (b) of Theorem 1.6.3, it follows that $C = A^{-1}$. Thus, A is invertible.



We know from earlier work that invertible matrix factors produce an invertible product. The following theorem, which will be 
proved later, looks at the converse: It shows that if the product of square matrices is invertible, then the factors themselves must be 
invertible. 



THEOREM 1.6.5 



Let A and B be square matrices of the same size. If AB is invertible, then A and B must also be invertible. 



In our later work the following fundamental problem will occur frequently in various contexts. 



A Fundamental Problem: Let A be a fixed m × n matrix. Find all m × 1 matrices b such that the system of equations Ax = b is
consistent.



If A is an invertible matrix, Theorem 1.6.2 completely solves this problem by asserting that for every m × 1 matrix b, the linear
system Ax = b has the unique solution $x = A^{-1}b$. If A is not square, or if A is square but not invertible, then Theorem 1.6.2 does not
apply. In these cases the matrix b must usually satisfy certain conditions in order for Ax = b to be consistent. The following example
illustrates how the elimination methods of Section 1.2 can be used to determine such conditions.



EXAMPLE 3 Determining Consistency by Elimination 



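What conditions must $b_1$, $b_2$, and $b_3$ satisfy in order for the system of equations

\[
\begin{aligned}
x_1 + x_2 + 2x_3 &= b_1 \\
x_1 \phantom{{}+ x_2} + x_3 &= b_2 \\
2x_1 + x_2 + 3x_3 &= b_3
\end{aligned}
\]

to be consistent?



Solution

The augmented matrix is

\[
\left[\begin{array}{ccc|c}
1 & 1 & 2 & b_1 \\
1 & 0 & 1 & b_2 \\
2 & 1 & 3 & b_3
\end{array}\right]
\]

which can be reduced to row-echelon form as follows:

\[
\left[\begin{array}{ccc|c}
1 & 1 & 2 & b_1 \\
0 & -1 & -1 & b_2 - b_1 \\
0 & -1 & -1 & b_3 - 2b_1
\end{array}\right]
\]

−1 times the first row was added to the second and −2 times the first row was added to the third.

\[
\left[\begin{array}{ccc|c}
1 & 1 & 2 & b_1 \\
0 & 1 & 1 & b_1 - b_2 \\
0 & -1 & -1 & b_3 - 2b_1
\end{array}\right]
\]

The second row was multiplied by −1.

\[
\left[\begin{array}{ccc|c}
1 & 1 & 2 & b_1 \\
0 & 1 & 1 & b_1 - b_2 \\
0 & 0 & 0 & b_3 - b_2 - b_1
\end{array}\right]
\]

The second row was added to the third.

It is now evident from the third row in the matrix that the system has a solution if and only if $b_1$, $b_2$, and $b_3$ satisfy the condition

\[ b_3 - b_2 - b_1 = 0 \quad\text{or}\quad b_3 = b_1 + b_2 \]

To express this condition another way, Ax = b is consistent if and only if b is a matrix of the form

\[
b = \begin{bmatrix} b_1 \\ b_2 \\ b_1 + b_2 \end{bmatrix}
\]

where $b_1$ and $b_2$ are arbitrary.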
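The consistency condition found in Example 3 can be tested numerically. The function `is_consistent` below is a hypothetical helper written for this illustration: it forward-eliminates the augmented matrix with exact fractions and reports the system as inconsistent exactly when some row reduces to zeros in the coefficient part with a nonzero entry on the right.

```python
from fractions import Fraction

def is_consistent(A, b):
    """Return True if Ax = b has at least one solution (forward elimination on [A | b])."""
    M = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    rows, cols = len(M), len(M[0]) - 1
    pivot_row = 0
    for col in range(cols):
        pivot = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]
        # Eliminate the entries below the pivot.
        for r in range(pivot_row + 1, rows):
            factor = M[r][col] / M[pivot_row][col]
            M[r] = [v - factor * p for v, p in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    # A row [0 ... 0 | c] with c != 0 signals inconsistency.
    return all(any(v != 0 for v in row[:-1]) or row[-1] == 0 for row in M)

# Coefficient matrix of Example 3.
A = [[1, 1, 2], [1, 0, 1], [2, 1, 3]]
```

For instance, `is_consistent(A, [1, 2, 3])` holds because 3 = 1 + 2, while `is_consistent(A, [1, 2, 4])` fails, in agreement with the condition $b_3 = b_1 + b_2$.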



EXAMPLE 4 Determining Consistency by Elimination 



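What conditions must $b_1$, $b_2$, and $b_3$ satisfy in order for the system of equations

\[
\begin{aligned}
x_1 + 2x_2 + 3x_3 &= b_1 \\
2x_1 + 5x_2 + 3x_3 &= b_2 \\
x_1 \phantom{{}+ 2x_2} + 8x_3 &= b_3
\end{aligned}
\]

to be consistent?



Solution

The augmented matrix is

\[
\left[\begin{array}{ccc|c}
1 & 2 & 3 & b_1 \\
2 & 5 & 3 & b_2 \\
1 & 0 & 8 & b_3
\end{array}\right]
\]

Reducing this to reduced row-echelon form yields (verify)

\[
\left[\begin{array}{ccc|c}
1 & 0 & 0 & -40b_1 + 16b_2 + 9b_3 \\
0 & 1 & 0 & 13b_1 - 5b_2 - 3b_3 \\
0 & 0 & 1 & 5b_1 - 2b_2 - b_3
\end{array}\right] \tag{2}
\]

In this case there are no restrictions on $b_1$, $b_2$, and $b_3$; that is, the given system Ax = b has the unique solution

\[
x_1 = -40b_1 + 16b_2 + 9b_3, \quad x_2 = 13b_1 - 5b_2 - 3b_3, \quad x_3 = 5b_1 - 2b_2 - b_3 \tag{3}
\]

for all b.



Remark Because the system Ax = b in the preceding example is consistent for all b, it follows from Theorem 1.6.4 that A is
invertible. We leave it for the reader to verify that the formulas in (3) can also be obtained by calculating $x = A^{-1}b$.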



Exercise Set 1 .6 






In Exercises 1-8 solve the system by inverting the coefficient matrix and using Theorem 1.6.2. 



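1.
\[ \begin{aligned} x_1 + x_2 &= 2 \\ 5x_1 + 6x_2 &= 9 \end{aligned} \]

2.
\[ \begin{aligned} 4x_1 - 3x_2 &= -3 \\ 2x_1 - 5x_2 &= 9 \end{aligned} \]

3.
\[ \begin{aligned} x_1 + 3x_2 + x_3 &= 4 \\ 2x_1 + 2x_2 + x_3 &= -1 \\ 2x_1 + 3x_2 + x_3 &= 3 \end{aligned} \]

4.
\[ \begin{aligned} 5x_1 + 3x_2 + 2x_3 &= 4 \\ 3x_1 + 3x_2 + 2x_3 &= 2 \\ x_2 + x_3 &= 5 \end{aligned} \]

5.
\[ \begin{aligned} x + y + z &= 5 \\ x + y - 4z &= 10 \\ -4x + y + z &= 0 \end{aligned} \]

6.
\[ \begin{aligned} -x - 2y - 3z &= 0 \\ w + x + 4y + 4z &= 10 \\ w + 3x + 7y + 9z &= 4 \\ -w - 2x - 4y - 6z &= 6 \end{aligned} \]

7.
\[ \begin{aligned} 3x_1 + 5x_2 &= b_1 \\ x_1 + 2x_2 &= b_2 \end{aligned} \]

8.
\[ \begin{aligned} x_1 + 2x_2 + 3x_3 &= b_1 \\ 2x_1 + 5x_2 + 5x_3 &= b_2 \\ 3x_1 + 5x_2 + 8x_3 &= b_3 \end{aligned} \]

9. Solve the following general system by inverting the coefficient matrix and using Theorem 1.6.2.

\[ \begin{aligned} x_1 + 2x_2 + 3x_3 &= b_1 \\ x_1 - x_2 + x_3 &= b_2 \\ x_1 + x_2 &= b_3 \end{aligned} \]

Use the resulting formulas to find the solution if

(a) $b_1 = -1$, $b_2 = 3$, $b_3 = 4$

(b) $b_1 = 5$, $b_2 = 0$, $b_3 = 0$

(c) $b_1 = -1$, $b_2 = -1$, $b_3 = 3$

10. Solve the three systems in Exercise 9 using the method of Example 2.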

In Exercises 11-14 use the method of Example 2 to solve the systems in all parts simultaneously. 

3*i I 2x 2 =b 2 

(a) bi = hb 2 =4 

(b) bi= -%b 2 = 5 

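12.
\[ \begin{aligned} -x_1 + 4x_2 + x_3 &= b_1 \\ x_1 + 9x_2 - 2x_3 &= b_2 \\ 6x_1 + 4x_2 - 8x_3 &= b_3 \end{aligned} \]

(a) $b_1 = 0$, $b_2 = 1$, $b_3 = 0$

(b) $b_1 = -3$, $b_2 = 4$, $b_3 = -5$

13.
\[ \begin{aligned} 4x_1 - 7x_2 &= b_1 \\ x_1 + 2x_2 &= b_2 \end{aligned} \]

(a) $b_1 = 0$, $b_2 = 1$

(b) $b_1 = -4$, $b_2 = 6$

(c) $b_1 = -1$, $b_2 = 3$

(d) $b_1 = -5$, $b_2 = 1$

14.
\[ \begin{aligned} x_1 + 3x_2 + 5x_3 &= b_1 \\ -x_1 - 2x_2 &= b_2 \\ 2x_1 + 5x_2 + 4x_3 &= b_3 \end{aligned} \]

(a) $b_1 = 1$, $b_2 = 0$, $b_3 = -1$

(b) $b_1 = 0$, $b_2 = 1$, $b_3 = 1$

(c) $b_1 = -1$, $b_2 = -1$, $b_3 = 0$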



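15. The method of Example 2 can be used for linear systems with infinitely many solutions. Use that method to solve the systems
in both parts at the same time.

(a)
\[ \begin{aligned} x_1 - 2x_2 + x_3 &= -2 \\ 2x_1 - 5x_2 + x_3 &= 1 \\ 3x_1 - 7x_2 + 2x_3 &= -1 \end{aligned} \]

(b)
\[ \begin{aligned} x_1 - 2x_2 + x_3 &= 1 \\ 2x_1 - 5x_2 + x_3 &= -1 \\ 3x_1 - 7x_2 + 2x_3 &= 0 \end{aligned} \]

In Exercises 16-19 find conditions that the b's must satisfy for the system to be consistent.

16.
\[ \begin{aligned} 6x_1 - 4x_2 &= b_1 \\ 3x_1 - 2x_2 &= b_2 \end{aligned} \]

17.
\[ \begin{aligned} x_1 - 2x_2 + 5x_3 &= b_1 \\ 4x_1 - 5x_2 + 8x_3 &= b_2 \\ -3x_1 + 3x_2 - 3x_3 &= b_3 \end{aligned} \]

18.
\[ \begin{aligned} x_1 - 2x_2 - x_3 &= b_1 \\ -4x_1 + 5x_2 + 2x_3 &= b_2 \\ -4x_1 + 7x_2 + 4x_3 &= b_3 \end{aligned} \]

19.
\[ \begin{aligned} x_1 - x_2 + 3x_3 + 2x_4 &= b_1 \\ -2x_1 + x_2 + 5x_3 + x_4 &= b_2 \\ -3x_1 + 2x_2 + 2x_3 - x_4 &= b_3 \\ 4x_1 - 3x_2 + x_3 + 3x_4 &= b_4 \end{aligned} \]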



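20. Consider the matrices

\[
A = \begin{bmatrix} 2 & 1 & 2 \\ 2 & 2 & -2 \\ 3 & 1 & 1 \end{bmatrix}
\quad\text{and}\quad
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
\]

(a) Show that the equation Ax = x can be rewritten as (A − I)x = 0 and use this result to solve Ax = x for x.

(b) Solve Ax = 4x.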



21. 



Solve the following matrix equation for X. 

2-1 5 7 8 _ 
4 0-301 
3 5-721 



1 


-1 


f 




2 


3 





X = 





2 


-1 





22. In each part, determine whether the homogeneous system has a nontrivial solution (without using pencil and paper); then state
whether the given matrix is invertible.

(a)
\[
\begin{aligned}
2x_1 + x_2 - 3x_3 + x_4 &= 0 \\
5x_2 + 4x_3 + 3x_4 &= 0 \\
x_3 + 2x_4 &= 0 \\
3x_4 &= 0
\end{aligned}
\qquad
\begin{bmatrix} 2 & 1 & -3 & 1 \\ 0 & 5 & 4 & 3 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 3 \end{bmatrix}
\]

(b)
\[
\begin{aligned}
5x_1 + x_2 + 4x_3 + x_4 &= 0 \\
2x_3 - x_4 &= 0 \\
x_3 + x_4 &= 0 \\
7x_4 &= 0
\end{aligned}
\qquad
\begin{bmatrix} 5 & 1 & 4 & 1 \\ 0 & 0 & 2 & -1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 7 \end{bmatrix}
\]

23. Let Ax = 0 be a homogeneous system of n linear equations in n unknowns that has only the trivial solution. Show that if k is
any positive integer, then the system $A^k x = 0$ also has only the trivial solution.

24. Let Ax = 0 be a homogeneous system of n linear equations in n unknowns, and let Q be an invertible n × n matrix. Show that
Ax = 0 has just the trivial solution if and only if (QA)x = 0 has just the trivial solution.

25. Let Ax = b be any consistent system of linear equations, and let $x_1$ be a fixed solution. Show that every solution to the system
can be written in the form $x = x_1 + x_0$, where $x_0$ is a solution to Ax = 0. Show also that every matrix of this form is a solution.



26. 



Use part (a) of Theorem 1.6.3 to prove part (b). 



27. 



What restrictions must be placed on x and y for the following matrices to be invertible? 



(a) 



(b) 



(c) 



X 


y 


X 


X 


X 





y 


y 


X 


y 


y 


X 



Discussion
Discovery



28.

(a) If A is an n × n matrix and if b is an n × 1 matrix, what conditions would you impose to ensure
that the equation x = Ax + b has a unique solution for x?

(b) Assuming that your conditions are satisfied, find a formula for the solution in terms of an
appropriate inverse.

29. Suppose that A is an invertible n × n matrix. Must the system of equations Ax = x have a unique
solution? Explain your reasoning.



30. 



Is it possible to have AB = I without B being the inverse of A? Explain your reasoning. 



31. 



Create a theorem by rewriting Theorem 1.6.5 in contrapositive form (see Exercise 34 of Section 1.4). 
1.7

DIAGONAL, 
TRIANGULAR, AND 
SYMMETRIC MATRICES 



In this section we shall consider certain classes of matrices that have special 
forms. The matrices that we study in this section are among the most 
important kinds of matrices encountered in linear algebra and will arise in 
many different settings throughout the text. 



Diagonal Matrices 

A square matrix in which all the entries off the main diagonal are zero is called a diagonal matrix. Here are some examples: 

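\[
\begin{bmatrix} 2 & 0 \\ 0 & -5 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 6 & 0 & 0 & 0 \\ 0 & -4 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 8 \end{bmatrix}
\]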

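A general n × n diagonal matrix D can be written as

\[
D = \begin{bmatrix}
d_1 & 0 & \cdots & 0 \\
0 & d_2 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & d_n
\end{bmatrix} \tag{1}
\]

A diagonal matrix is invertible if and only if all of its diagonal entries are nonzero; in this case the inverse of (1) is

\[
D^{-1} = \begin{bmatrix}
1/d_1 & 0 & \cdots & 0 \\
0 & 1/d_2 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1/d_n
\end{bmatrix}
\]

The reader should verify that $DD^{-1} = D^{-1}D = I$.



Powers of diagonal matrices are easy to compute; we leave it for the reader to verify that if D is the diagonal matrix (1) and k is a
positive integer, then

\[
D^k = \begin{bmatrix}
d_1^k & 0 & \cdots & 0 \\
0 & d_2^k & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & d_n^k
\end{bmatrix}
\]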



EXAMPLE 1 Inverses and Powers of Diagonal Matrices 



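If

\[
A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\]

then

\[
A^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -\tfrac{1}{3} & 0 \\ 0 & 0 & \tfrac{1}{2} \end{bmatrix}, \quad
A^{5} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -243 & 0 \\ 0 & 0 & 32 \end{bmatrix}, \quad
A^{-5} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -\tfrac{1}{243} & 0 \\ 0 & 0 & \tfrac{1}{32} \end{bmatrix}
\]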
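Since a diagonal matrix is determined by its diagonal entries, its inverse and powers can be computed entry by entry. The sketch below stores only the diagonal as a list; the function name `diag_power` is an assumption made for this illustration.

```python
from fractions import Fraction

def diag_power(d, k):
    """Return the diagonal entries of D^k, where D = diag(d) and k is any integer.

    Negative k requires every diagonal entry to be nonzero (D invertible).
    """
    if k < 0 and any(x == 0 for x in d):
        raise ZeroDivisionError("a diagonal matrix with a zero diagonal entry is not invertible")
    return [Fraction(x) ** k for x in d]

# Diagonal of the matrix A in Example 1.
d = [1, -3, 2]
inverse = diag_power(d, -1)
fifth_power = diag_power(d, 5)
inverse_fifth_power = diag_power(d, -5)
```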



Matrix products that involve diagonal factors are especially easy to compute. For example, 











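\[
\begin{bmatrix} d_1 & 0 & 0 \\ 0 & d_2 & 0 \\ 0 & 0 & d_3 \end{bmatrix}
\begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix}
=
\begin{bmatrix} d_1a_{11} & d_1a_{12} & d_1a_{13} & d_1a_{14} \\ d_2a_{21} & d_2a_{22} & d_2a_{23} & d_2a_{24} \\ d_3a_{31} & d_3a_{32} & d_3a_{33} & d_3a_{34} \end{bmatrix}
\]

\[
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{bmatrix}
\begin{bmatrix} d_1 & 0 & 0 \\ 0 & d_2 & 0 \\ 0 & 0 & d_3 \end{bmatrix}
=
\begin{bmatrix} d_1a_{11} & d_2a_{12} & d_3a_{13} \\ d_1a_{21} & d_2a_{22} & d_3a_{23} \\ d_1a_{31} & d_2a_{32} & d_3a_{33} \\ d_1a_{41} & d_2a_{42} & d_3a_{43} \end{bmatrix}
\]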



In words, to multiply a matrix A on the left by a diagonal matrix D, one can multiply successive rows of A by the successive
diagonal entries of D, and to multiply A on the right by D, one can multiply successive columns of A by the successive diagonal
entries of D.
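The row and column scaling rules can be compared against ordinary matrix multiplication. In the sketch below the function names and the sample matrix are assumptions chosen for illustration; `matmul` and `diag` exist only to cross-check the two shortcuts.

```python
def scale_rows(d, A):
    """Compute DA for D = diag(d): row i of A is multiplied by d[i]."""
    return [[di * a for a in row] for di, row in zip(d, A)]

def scale_columns(A, d):
    """Compute AD for D = diag(d): column j of A is multiplied by d[j]."""
    return [[a * dj for a, dj in zip(row, d)] for row in A]

def matmul(A, B):
    """Ordinary matrix product, used here only as a cross-check."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def diag(d):
    """Build the full diagonal matrix with diagonal entries d."""
    return [[d[i] if i == j else 0 for j in range(len(d))] for i in range(len(d))]

# Hypothetical sample data.
d = [2, -1, 3]
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

The shortcuts use on the order of n² multiplications for an n × n matrix, compared with n³ for the general product.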

Triangular Matrices 

A square matrix in which all the entries above the main diagonal are zero is called lower triangular, and a square matrix in 
which all the entries below the main diagonal are zero is called upper triangular. A matrix that is either upper triangular or 
lower triangular is called triangular. 



EXAMPLE 2 Upper and Lower Triangular Matrices 



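\[
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} \\
0 & a_{22} & a_{23} & a_{24} \\
0 & 0 & a_{33} & a_{34} \\
0 & 0 & 0 & a_{44}
\end{bmatrix}
\qquad
\begin{bmatrix}
a_{11} & 0 & 0 & 0 \\
a_{21} & a_{22} & 0 & 0 \\
a_{31} & a_{32} & a_{33} & 0 \\
a_{41} & a_{42} & a_{43} & a_{44}
\end{bmatrix}
\]

The first matrix is a general 4 × 4 upper triangular matrix; the second is a general 4 × 4 lower triangular matrix.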


Remark Observe that diagonal matrices are both upper triangular and lower triangular since they have zeros below and above
the main diagonal. Observe also that a square matrix in row-echelon form is upper triangular since it has zeros below the main
diagonal.

The following are four useful characterizations of triangular matrices. The reader will find it instructive to verify that the 
matrices in Example 2 have the stated properties. 

* A square matrix $A = [a_{ij}]$ is upper triangular if and only if the ith row starts with at least i − 1 zeros.

* A square matrix $A = [a_{ij}]$ is lower triangular if and only if the jth column starts with at least j − 1 zeros.

* A square matrix $A = [a_{ij}]$ is upper triangular if and only if $a_{ij} = 0$ for i > j.

* A square matrix $A = [a_{ij}]$ is lower triangular if and only if $a_{ij} = 0$ for i < j.

The following theorem lists some of the basic properties of triangular matrices. 
THEOREM 1.7.1 



(a) The transpose of a lower triangular matrix is upper triangular, and the transpose of an upper triangular matrix is 
lower triangular. 

(b) The product of lower triangular matrices is lower triangular, and the product of upper triangular matrices is upper 
triangular. 

(c) A triangular matrix is invertible if and only if its diagonal entries are all nonzero. 

(d) The inverse of an invertible lower triangular matrix is lower triangular, and the inverse of an invertible upper 
triangular matrix is upper triangular. 



Part (a) is evident from the fact that transposing a square matrix can be accomplished by reflecting the entries about the main
diagonal; we omit the formal proof. We will prove (b), but we will defer the proofs of (c) and (d) to the next chapter, where we
will have the tools to prove those results more efficiently.



Proof (b) We will prove the result for lower triangular matrices; the proof for upper triangular matrices is similar. Let
$A = [a_{ij}]$ and $B = [b_{ij}]$ be lower triangular n × n matrices, and let $C = [c_{ij}]$ be the product C = AB. From the remark preceding
this theorem, we can prove that C is lower triangular by showing that $c_{ij} = 0$ for i < j. But from the definition of matrix
multiplication,

\[ c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} \]

If we assume that i < j, then the terms in this expression can be grouped as follows:

\[
c_{ij} = \underbrace{a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{i(j-1)}b_{(j-1)j}}_{\substack{\text{Terms in which the row number of } b \\ \text{is less than the column number of } b}}
+ \underbrace{a_{ij}b_{jj} + \cdots + a_{in}b_{nj}}_{\substack{\text{Terms in which the row number of } a \\ \text{is less than the column number of } a}}
\]

In the first grouping all of the b factors are zero since B is lower triangular, and in the second
grouping all of the a factors are zero since A is lower triangular. Thus, $c_{ij} = 0$, which is what we
wanted to prove.



EXAMPLE 3 Upper Triangular Matrices 



Consider the upper triangular matrices 



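\[
A = \begin{bmatrix} 1 & 3 & -1 \\ 0 & 2 & 4 \\ 0 & 0 & 5 \end{bmatrix}, \quad
B = \begin{bmatrix} 3 & -2 & 2 \\ 0 & 0 & -1 \\ 0 & 0 & 1 \end{bmatrix}
\]

The matrix A is invertible, since its diagonal entries are nonzero, but the matrix B is not. We leave it for the reader to calculate
the inverse of A by the method of Section 1.5 and show that

\[
A^{-1} = \begin{bmatrix} 1 & -\tfrac{3}{2} & \tfrac{7}{5} \\ 0 & \tfrac{1}{2} & -\tfrac{2}{5} \\ 0 & 0 & \tfrac{1}{5} \end{bmatrix}
\]

This inverse is upper triangular, as guaranteed by part (d) of Theorem 1.7.1. We also leave it for the reader to check that the
product AB is

\[
AB = \begin{bmatrix} 3 & -2 & -2 \\ 0 & 0 & 2 \\ 0 & 0 & 5 \end{bmatrix}
\]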

This product is upper triangular, as guaranteed by part (b) of Theorem 1.7.1. 
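The claims of Example 3 can be verified directly. The following sketch uses exact fractions; the helper names `matmul` and `is_upper_triangular` are assumptions introduced for this illustration.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def is_upper_triangular(M):
    """True if every entry below the main diagonal is zero."""
    return all(M[i][j] == 0 for i in range(len(M)) for j in range(i))

# Matrices from Example 3.
A = [[1, 3, -1], [0, 2, 4], [0, 0, 5]]
B = [[3, -2, 2], [0, 0, -1], [0, 0, 1]]
A_inv = [[1, F(-3, 2), F(7, 5)],
         [0, F(1, 2), F(-2, 5)],
         [0, 0, F(1, 5)]]

AB = matmul(A, B)
identity_check = matmul(A, A_inv)   # should be the 3 x 3 identity
```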

Symmetric Matrices 

A square matrix A is called symmetric if $A = A^T$.



EXAMPLE 4 Symmetric Matrices 



The following matrices are symmetric, since each is equal to its own transpose (verify). 

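\[
\begin{bmatrix} 7 & -3 \\ -3 & 5 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 4 & 5 \\ 4 & -3 & 0 \\ 5 & 0 & 7 \end{bmatrix}, \quad
\begin{bmatrix} d_1 & 0 & 0 & 0 \\ 0 & d_2 & 0 & 0 \\ 0 & 0 & d_3 & 0 \\ 0 & 0 & 0 & d_4 \end{bmatrix}
\]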



It is easy to recognize symmetric matrices by inspection: The entries on the main diagonal may be arbitrary, but as shown in
(2), "mirror images" of entries across the main diagonal must be equal.

[Display (2): a symmetric matrix in which the equal mirror-image entries across the main diagonal are marked]

This follows from the fact that transposing a square matrix can be accomplished by interchanging entries that are symmetrically
positioned about the main diagonal. Expressed in terms of the individual entries, a matrix $A = [a_{ij}]$ is symmetric if and only if
$a_{ij} = a_{ji}$ for all values of i and j. As illustrated in Example 4, all diagonal matrices are symmetric.

The following theorem lists the main algebraic properties of symmetric matrices. The proofs are direct consequences of 
Theorem 1.4.9 and are left for the reader. 



THEOREM 1.7.2 



If A and B are symmetric matrices with the same size, and ifk is any scalar, then: 



(a) $A^T$ is symmetric.



(b) A + B and A − B are symmetric.



(c) kA is symmetric.



Remark It is not true, in general, that the product of symmetric matrices is symmetric. To see why this is so, let A and B be
symmetric matrices with the same size. Then from part (d) of Theorem 1.4.9 and the symmetry, we have

\[ (AB)^T = B^T A^T = BA \]

Since AB and BA are not usually equal, it follows that AB will not usually be symmetric. However, in the special case where
AB = BA, the product AB will be symmetric. If A and B are matrices such that AB = BA, then we say that A and B commute. In
summary: The product of two symmetric matrices is symmetric if and only if the matrices commute.



EXAMPLE 5 Products of Symmetric Matrices 



The first of the following equations shows a product of symmetric matrices that is not symmetric, and the second shows a 
product of symmetric matrices that is symmetric. We conclude that the factors in the first equation do not commute, but those in 
the second equation do. We leave it for the reader to verify that this is so. 





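\[
\begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix}
\begin{bmatrix} -4 & 1 \\ 1 & 0 \end{bmatrix}
=
\begin{bmatrix} -2 & 1 \\ -5 & 2 \end{bmatrix}
\]

\[
\begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix}
\begin{bmatrix} -4 & 3 \\ 3 & -1 \end{bmatrix}
=
\begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}
\]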




In general, a symmetric matrix need not be invertible; for example, a square zero matrix is symmetric, but not invertible. 
However, if a symmetric matrix is invertible, then that inverse is also symmetric. 



THEOREM 1.7.3



If A is an invertible symmetric matrix, then $A^{-1}$ is symmetric.



Proof Assume that A is symmetric and invertible. From Theorem 1.4.10 and the fact that $A = A^T$, we have

\[ (A^{-1})^T = (A^T)^{-1} = A^{-1} \]

which proves that $A^{-1}$ is symmetric.



Products $AA^T$ and $A^TA$



Matrix products of the form $AA^T$ and $A^TA$ arise in a variety of applications. If A is an m × n matrix, then $A^T$ is an n × m matrix,
so the products $AA^T$ and $A^TA$ are both square matrices: the matrix $AA^T$ has size m × m, and the matrix $A^TA$ has size n × n.
Such products are always symmetric since

\[ (AA^T)^T = (A^T)^T A^T = AA^T \quad\text{and}\quad (A^TA)^T = A^T (A^T)^T = A^TA \]



EXAMPLE 6 The Product of a Matrix and Its Transpose Is Symmetric 



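Let A be the 2 × 3 matrix

\[
A = \begin{bmatrix} 1 & -2 & 4 \\ 3 & 0 & -5 \end{bmatrix}
\]

Then

\[
A^TA = \begin{bmatrix} 1 & 3 \\ -2 & 0 \\ 4 & -5 \end{bmatrix}
\begin{bmatrix} 1 & -2 & 4 \\ 3 & 0 & -5 \end{bmatrix}
= \begin{bmatrix} 10 & -2 & -11 \\ -2 & 4 & -8 \\ -11 & -8 & 41 \end{bmatrix}
\]

\[
AA^T = \begin{bmatrix} 1 & -2 & 4 \\ 3 & 0 & -5 \end{bmatrix}
\begin{bmatrix} 1 & 3 \\ -2 & 0 \\ 4 & -5 \end{bmatrix}
= \begin{bmatrix} 21 & -17 \\ -17 & 34 \end{bmatrix}
\]

Observe that $A^TA$ and $AA^T$ are symmetric as expected.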
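The two products in Example 6 can be recomputed as a check. This is a small sketch with helper names chosen for illustration; it forms both products and tests each for symmetry.

```python
def matmul(A, B):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(M):
    """Return the transpose of M."""
    return [list(col) for col in zip(*M)]

# The 2 x 3 matrix from Example 6.
A = [[1, -2, 4], [3, 0, -5]]
At = transpose(A)

AtA = matmul(At, A)   # 3 x 3
AAt = matmul(A, At)   # 2 x 2
```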

Later in this text, we will obtain general conditions on A under which $AA^T$ and $A^TA$ are invertible. However, in the special case
where A is square, we have the following result.

THEOREM 1.7.4



If A is an invertible matrix, then $AA^T$ and $A^TA$ are also invertible.



Proof Since A is invertible, so is $A^T$ by Theorem 1.4.10. Thus $AA^T$ and $A^TA$ are invertible, since they are the products of
invertible matrices.



Exercise Set 1 .7 






1. 



Determine whether the matrix is invertible; if so, find the inverse by inspection. 



(a) 



2 
-5 



(b) 



"4 





0" 

















5 



(c) 



10 
2 

I 

3 



Compute the product by inspection. 



(a) 



3 


0" 


2 f 





-1 


-4 1 





2 


2 5 



(b) 





1 
4 



4 


-1 


3] 


1 


2 





-5 


1 


-:J 



-3 
5 
2 



Find $A^2$, $A^{-2}$, and $A^{-k}$ by inspection.



(a) j4 = 



1 
-2 



(b) 



^4 = 



"l 

2 











1 

3 











1 

4 



Which of the following matrices are symmetric? 



(a) 



2 -1 

1 2 



(b) 



3 4 

4 



(c) 



2 

-1 

3 



-1 3 
5 1 
1 7 



(d) 



"0 





r 





2 





3 









By inspection, determine whether the given triangular matrix is invertible. 



(a) 



(b) 





1 2 4 
3 
5 







1 -2 


5 





1 5 


6 





-3 


1 








5 



Find all values of a, b, and c for which A is symmetric. 



A = 



2 a — 2b 4- 2c 2a + & + c 

3 5 a + c 
0-2 7 



Find all values of a and b for which A and B are both not invertible.



A = 



a+b-\ 
3 



5 = 



5 

2a-2b-l 



Use the given equation to determine by inspection whether the matrices on the left commute. 



(a) 



(b) 



1 


-3 


4 1 




1 


-5 


3 


2_ 


1 2_ 




_-10 


1 


2 


-f 


"3 2" 




"4 3" 




1 


3 


2 1 




3 1 





9. 



Show that A and B commute if a − d = 7b.



A = 



2 1 
1 -5 



B 



a b 
b d 



10. 



Find a diagonal matrix A that satisfies 



(a) 



A 5 = 



1 





0" 





-1 











-1 



(b) 



A~ 2 = 



"9 





0" 





4 











1 



11. 



(a) Factor A into the form A = BD, where D is a diagonal matrix.

\[
A = \begin{bmatrix} 3a_{11} & 5a_{12} & 7a_{13} \\ 3a_{21} & 5a_{22} & 7a_{23} \\ 3a_{31} & 5a_{32} & 7a_{33} \end{bmatrix}
\]



(b) Is your factorization the only one possible? Explain. 



12. 



Verify Theorem 1.7.1b for the product AB, where



A = 



-1 2 


5 




1 


3 


, B = 





-A 





2 


-8 0" 





2 1 





3 



13. 



Verify Theorem 1.7.1d for the matrices A and B in Exercise 12.



14. 



Verify Theorem 1.7.3 for the given matrix A. 



(a) 



(b) 



A = 



A = 



2 


-1 




-1 


3_ 




1 


-2 


3 


-2 


1 


-7 


3 


-7 


4 



15. 



Let A be an n × n symmetric matrix.

(a) Show that $A^2$ is symmetric.

(b) Show that $2A^2 - 3A + I$ is symmetric.

16. Let A be an n × n symmetric matrix.

(a) Show that $A^k$ is symmetric if k is any nonnegative integer.

(b) If p(x) is a polynomial, is p(A) necessarily symmetric? Explain.

17. Let A be an n × n upper triangular matrix, and let p(x) be a polynomial. Is p(A) necessarily upper triangular? Explain.

18. Prove: If $A^TA = A$, then A is symmetric and $A = A^2$.

19. Find all 3 × 3 diagonal matrices A that satisfy $A^2 - 3A - 4I = 0$.

20. Let $A = [a_{ij}]$ be an n × n matrix. Determine whether A is symmetric.

(a) $a_{ij} = i^2 + j^2$

(b) $a_{ij} = i^2 - j^2$

(c) $a_{ij} = 2i + 2j$

(d) $a_{ij} = 2i^2 + 2j^3$

21. On the basis of your experience with Exercise 20, devise a general test that can be applied to a formula for $a_{ij}$ to determine whether
$A = [a_{ij}]$ is symmetric.

22. A square matrix A is called skew-symmetric if $A^T = -A$. Prove:

(a) If A is an invertible skew-symmetric matrix, then $A^{-1}$ is skew-symmetric.

(b) If A and B are skew-symmetric, then so are $A^T$, A + B, A − B, and kA for any scalar k.

(c) Every square matrix A can be expressed as the sum of a symmetric matrix and a skew-symmetric matrix.

Hint Note the identity $A = \tfrac{1}{2}(A + A^T) + \tfrac{1}{2}(A - A^T)$.

23. We showed in the text that the product of symmetric matrices is symmetric if and only if the matrices commute. Is the
product of commuting skew-symmetric matrices skew-symmetric? Explain.

Note See Exercise 22 for terminology.

24. If the n × n matrix A can be expressed as A = LU, where L is a lower triangular matrix and U is an upper triangular matrix,
then the linear system Ax = b can be expressed as LUx = b and can be solved in two steps:

Step 1. Let Ux = y, so that LUx = b can be expressed as Ly = b. Solve this system.

Step 2. Solve the system Ux = y for x.



In each part, use this two-step method to solve the given system. 



(a) 



1 

2 3 
2 4 1 



2 


-1 3" 


"*l" 




"1 





1 2 


*2 


= 


-2 





4 


*3 








(b) 



2 


0] 


4 


1 


3 


-2 3j 



5 2" 


"*l" 




"4 


4 1 


*2 


= 


-5 


2 


*3 




2 



25. 



Find an upper triangular matrix that satisfies 



A 2 = 



1 30 
-8 



Discussion
Discovery



26. What is the maximum number of distinct entries that an n × n symmetric matrix can have?
Explain your reasoning.

27. Invent and prove a theorem that describes how to multiply two diagonal matrices.

28. Suppose that A is a square matrix and D is a diagonal matrix such that AD = I. What can you say
about the matrix A? Explain your reasoning.



29.

(a) Make up a consistent linear system of five equations in five unknowns that has a lower
triangular coefficient matrix with no zeros on or below the main diagonal.

(b) Devise an efficient procedure for solving your system by hand.

(c) Invent an appropriate name for your procedure.

30. Indicate whether the statement is always true or sometimes false. Justify each answer.

(a) If $AA^T$ is singular, then so is A.

(b) If A + B is symmetric, then so are A and B.

(c) If A is an n × n matrix and Ax = 0 has only the trivial solution, then so does $A^Tx = 0$.

(d) If $A^2$ is symmetric, then so is A.
Chapter 1



Supplementary Exercises 



1. 



Use Gauss-Jordan elimination to solve for x′ and y′ in terms of x and y.



*=¥-¥ 

y = ¥ + iy' 



2. 



Use Gauss-Jordan elimination to solve for x′ and y′ in terms of x and y.

\[
\begin{aligned}
x &= x' \cos\theta - y' \sin\theta \\
y &= x' \sin\theta + y' \cos\theta
\end{aligned}
\]



3. 



Find a homogeneous linear system with two equations that are not multiples of one another and such that 



and 

are solutions of the system. 






4. A box containing pennies, nickels, and dimes has 13 coins with a total value of 83 cents. How many coins of each type
are in the box?



5. 



Find positive integers that satisfy

\[
\begin{aligned}
x + y + z &= 9 \\
x + 5y + 10z &= 44
\end{aligned}
\]



6. 



For which value(s) of a does the following system have zero solutions? One solution? Infinitely many solutions? 

^3 = 2 
(a 2 -4)x 3 = a-2 



Let 



a 





b 


2" 


a 


a 


4 


4 





a 


2 


b 



be the augmented matrix for a linear system. Find for what values of a and b the system has 



(a) a unique solution. 



(b) a one-parameter solution. 



(c) a two-parameter solution. 



(d) no solution. 



8. 



Solve for x, y, and z.

xy- 2 /y I 3zy = 8 

2xy- 3^y f 2zy = 1 

-xy+ ^y + 2zy = 4 



Find a matrix K such that AKB = C given that 



,4 = 



1 


4" 




-2 


3 


, 5 = 


1 


-2 





2 
1 -1 



C = 



8 


6 


-6" 


6 


-1 


1 


-4 









10. How should the coefficients a, b, and c be chosen so that the system

\[
\begin{aligned}
ax + by - 3z &= -3 \\
-2x - by + cz &= -1 \\
ax + 3y - cz &= -3
\end{aligned}
\]

has the solution x = 1, y = −1, and z = 2?



11. 



In each part, solve the matrix equation for X. 



(a) 



-1 
1 1 
3 1 



1 2 
-3 1 5 



(b) 



X 



1 -1 2 

3 1 



-5 
6 



-1 

-3 7 



(c) 



3 f 


JT-JT 


"1 4" 




"2 


-2" 


_ -1 2_ 




2 0_ 




_5 


4_ 



12. 



(a) Express the equations

\[
\begin{aligned}
y_1 &= x_1 - x_2 + x_3 \\
y_2 &= 3x_1 + x_2 - 4x_3 \\
y_3 &= -2x_1 - 2x_2 + 3x_3
\end{aligned}
\qquad\text{and}\qquad
\begin{aligned}
z_1 &= 4y_1 - y_2 + y_3 \\
z_2 &= -3y_1 + 5y_2 - y_3
\end{aligned}
\]

in the matrix forms Y = AX and Z = BY. Then use these to obtain a direct relationship
Z = CX between Z and X.

(b) Use the equation Z = CX obtained in (a) to express $z_1$ and $z_2$ in terms of $x_1$, $x_2$, and $x_3$.

(c) Check the result in (b) by directly substituting the equations for $y_1$, $y_2$, and $y_3$ into the equations for $z_1$ and $z_2$
and then simplifying.



If A is m x m and B is « x £>, how many multiplication operations and how many addition operations are needed to 
13- calculate the matrix product A5? 

Let A be a square matrix. 
14. 



(a) Show that (/-^) _1 =/ + ^ + ^ 2 + ^ 3 if ^ = 0. 

(b) Show that (/ - A) ~ l = I + A + ^4 2 + - + A" if ,4 M+1 = 0- 



15. Find values of a, b, and c such that the graph of the polynomial $p(x) = ax^2 + bx + c$ passes through the points (1, 2), (-1, 6), and (2, 3).

16. (For Readers Who Have Studied Calculus) Find values of a, b, and c such that the graph of the polynomial $p(x) = ax^2 + bx + c$ passes through the point (-1, 0) and has a horizontal tangent at (2, -9).



17. Let $J_n$ be the n x n matrix each of whose entries is 1. Show that if n > 1, then

\[ (I - J_n)^{-1} = I - \frac{1}{n-1} J_n \]

18. Show that if a square matrix A satisfies $A^3 + 4A^2 - 2A + 7I = 0$, then so does $A^T$.

19. Prove: If B is invertible, then $AB^{-1} = B^{-1}A$ if and only if AB = BA.

20. Prove: If A is invertible, then A + B and $I + BA^{-1}$ are both invertible or both not invertible.

21. Prove that if A and B are n x n matrices, then

(a) tr(A + B) = tr(A) + tr(B)

(b) tr(kA) = k tr(A)

(c) tr($A^T$) = tr(A)

(d) tr(AB) = tr(BA)

22. Use Exercise 21 to show that there are no square matrices A and B such that

\[ AB - BA = I \]

23. Prove: If A is an m x n matrix and B is the n x 1 matrix each of whose entries is 1/n, then

\[ AB = \begin{bmatrix} \bar{r}_1 \\ \bar{r}_2 \\ \vdots \\ \bar{r}_m \end{bmatrix} \]

where $\bar{r}_i$ is the average of the entries in the ith row of A.



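24. (For Readers Who Have Studied Calculus) If the entries of the matrix

\[
C = \begin{bmatrix}
c_{11}(x) & c_{12}(x) & \cdots & c_{1n}(x) \\
c_{21}(x) & c_{22}(x) & \cdots & c_{2n}(x) \\
\vdots & \vdots & & \vdots \\
c_{m1}(x) & c_{m2}(x) & \cdots & c_{mn}(x)
\end{bmatrix}
\]

are differentiable functions of x, then we define

\[
\frac{dC}{dx} = \begin{bmatrix}
c'_{11}(x) & c'_{12}(x) & \cdots & c'_{1n}(x) \\
c'_{21}(x) & c'_{22}(x) & \cdots & c'_{2n}(x) \\
\vdots & \vdots & & \vdots \\
c'_{m1}(x) & c'_{m2}(x) & \cdots & c'_{mn}(x)
\end{bmatrix}
\]

Show that if the entries in A and B are differentiable functions of x and the sizes of the matrices are such that the stated operations can be performed, then

(a) $\dfrac{d}{dx}(kA) = k\,\dfrac{dA}{dx}$

(b) $\dfrac{d}{dx}(A + B) = \dfrac{dA}{dx} + \dfrac{dB}{dx}$

(c) $\dfrac{d}{dx}(AB) = \dfrac{dA}{dx}\,B + A\,\dfrac{dB}{dx}$

25. (For Readers Who Have Studied Calculus) Use part (c) of Exercise 24 to show that

\[ \frac{dA^{-1}}{dx} = -A^{-1}\,\frac{dA}{dx}\,A^{-1} \]

State all the assumptions you make in obtaining this formula.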



26. Find the values of a, b, and c that will make the equation

\[ \frac{x^2 + x - 2}{(3x - 1)(x^2 + 1)} = \frac{a}{3x - 1} + \frac{bx + c}{x^2 + 1} \]

an identity.

Hint Multiply through by $(3x - 1)(x^2 + 1)$ and equate the corresponding coefficients of the polynomials on each side of the resulting equation.

27. If P is an n x 1 matrix such that $P^T P = 1$, then $H = I - 2PP^T$ is called the corresponding Householder matrix (named after the American mathematician A. S. Householder).

(a) Verify that $P^T P = 1$ if $P^T = \begin{bmatrix} \tfrac{3}{4} & \tfrac{1}{6} & \tfrac{1}{4} & \tfrac{5}{12} & \tfrac{5}{12} \end{bmatrix}$ and compute the corresponding Householder matrix.

(b) Prove that if H is any Householder matrix, then $H = H^T$ and $H^T H = I$.

(c) Verify that the Householder matrix found in part (a) satisfies the conditions proved in part (b).



28. Assuming that the stated inverses exist, prove the following equalities.

(a) $(C^{-1} + D^{-1})^{-1} = C(C + D)^{-1}D$

(b) $(I + CD)^{-1}C = C(I + DC)^{-1}$

(c) $(C + DD^T)^{-1}D = C^{-1}D(I + D^T C^{-1} D)^{-1}$

29. (a) Show that if $a \neq b$, then

\[ a^n + a^{n-1}b + a^{n-2}b^2 + \cdots + ab^{n-1} + b^n = \frac{a^{n+1} - b^{n+1}}{a - b} \]

(b) Use the result in part (a) to find $A^n$ if

\[ A = \begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 1 & 0 & c \end{bmatrix} \]

Note This exercise is based on a problem by John M. Johnson, The Mathematics Teacher, Vol. 85, No. 9, 1992.
Chapter 1



Technology Exercises



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 

Section 1.1 

T1. Numbers and Numerical Operations Read your documentation on entering and displaying numbers and performing the
basic arithmetic operations of addition, subtraction, multiplication, division, raising numbers to powers, and extraction of
roots. Determine how to control the number of digits in the screen display of a decimal number. If you are using a CAS, in
which case you can compute with exact numbers rather than decimal approximations, then learn how to enter such numbers
as $\pi$, $\sqrt{2}$, and $\tfrac{1}{3}$ exactly and convert them to decimal form. Experiment with numbers of your own choosing until you feel you
have mastered the procedures and operations.

Section 1.2

T1. Matrices and Reduced Row-Echelon Form Read your documentation on how to enter matrices and how to find the
reduced row-echelon form of a matrix. Then use your utility to find the reduced row-echelon form of the augmented matrix in
Example 4 of Section 1.2.



T2. Linear Systems With a Unique Solution Read your documentation on how to solve a linear system, and then use your 
utility to solve the linear system in Example 3 of Section 1.1. Also, solve the system by reducing the augmented matrix to 
reduced row-echelon form. 



T3. Linear Systems With Infinitely Many Solutions Technology utilities vary on how they handle linear systems with infinitely 
many solutions. See how your utility handles the system in Example 4 of Section 1.2. 



T4. Inconsistent Linear Systems Technology utilities will often successfully identify inconsistent linear systems, but they can 
sometimes be fooled into reporting an inconsistent system as consistent, or vice versa. This typically happens when some of 
the numbers that occur in the computations are so small that roundoff error makes it difficult for the utility to determine 
whether or not they are equal to zero. Create some inconsistent linear systems and see how your utility handles them. 
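
The roundoff effect described in T4 can be reproduced in a few lines. The sketch below is only an illustration in Python (not one of the utilities named above), and the 2 x 2 system in it is a made-up example: its coefficient rows are exactly proportional, so the system is inconsistent, yet the floating-point determinant comes out slightly different from zero.

```python
from fractions import Fraction

# Hypothetical system (chosen for illustration):
#        x +         y = 1
#   0.3  x + (0.1+0.2) y = 1
# In exact arithmetic 0.1 + 0.2 = 0.3, so the rows of the coefficient
# matrix are proportional and the system is inconsistent.

# Floating-point determinant of the coefficient matrix.
det_float = 1.0 * (0.1 + 0.2) - 1.0 * 0.3

# Exact determinant using rational arithmetic.
det_exact = 1 * (Fraction(1, 10) + Fraction(2, 10)) - 1 * Fraction(3, 10)

print("floating point:", det_float)   # tiny but nonzero
print("exact         :", det_exact)   # exactly 0
```

A utility that tests `det_float == 0` would wrongly treat this system as having a unique solution, which is why exact (CAS-style) arithmetic or a tolerance-based test is often preferred.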

T5. A polynomial whose graph passes through a given set of points is called an interpolating polynomial for those points. Some
technology utilities have specific commands for finding interpolating polynomials. If your utility has this capability, read the
documentation and then use this feature to solve Exercise 25 of Section 1.2.

Section 1.3



T1. Matrix Operations Read your documentation on how to perform the basic operations on matrices: addition, subtraction,
multiplication by scalars, and multiplication of matrices. Then perform the computations in Examples 3, 4,
and 5. See what happens when you try to perform an operation on matrices with inconsistent sizes.



T2. Evaluate the expression $A^5 - 3A^3 + 7A - 4I$ for the matrix

\[ A = \begin{bmatrix} 1 & -2 & 3 \\ -4 & 5 & -6 \\ 7 & -8 & 9 \end{bmatrix} \]



T3. Extracting Rows and Columns Read your documentation on how to extract rows and columns from a matrix, and then use 
your utility to extract various rows and columns from a matrix of your choice. 



T4. Transpose and Trace Read your documentation on how to find the transpose and trace of a matrix, and then use your utility 
to find the transpose of the matrix A in Formula (12) and the trace of the matrix B in Example 12. 



T5. Constructing an Augmented Matrix Read your documentation on how to create an augmented matrix [A | b] from matrices
A and b that have previously been entered. Then use your utility to form the augmented matrix for the system Ax = b in
Example 4 of Section 1.1 from the matrices A and b.



Section 1.4



T1. Zero and Identity Matrices Typing in entries of a matrix can be tedious, so many technology utilities provide shortcuts for
entering zero and identity matrices. Read your documentation on how to do this, and then enter some zero and identity
matrices of various sizes.



T2. Inverse Read your documentation on how to find the inverse of a matrix, and then use your utility to perform the 
computations in Example 7. 



T3. Formula for the Inverse If you are working with a CAS, use it to confirm Theorem 1.4.5. 



T4. Powers of a Matrix Read your documentation on how to find powers of a matrix, and then use your utility to find various 
positive and negative powers of the matrix A in Example 8. 



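T5. Let

\[ A = \begin{bmatrix} 1 & \tfrac{1}{2} & \tfrac{1}{3} \\ \tfrac{1}{4} & 1 & \tfrac{1}{5} \\ \tfrac{1}{6} & \tfrac{1}{7} & 1 \end{bmatrix} \]

Describe what happens to the matrix $A^k$ when k is allowed to increase indefinitely (that is, as $k \to \infty$).

T6. By experimenting with different values of n, find an expression for the inverse of an n x n matrix of the form

\[
A = \begin{bmatrix}
1 & 2 & 3 & 4 & \cdots & n-1 & n \\
0 & 1 & 2 & 3 & \cdots & n-2 & n-1 \\
0 & 0 & 1 & 2 & \cdots & n-3 & n-2 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 & 2 \\
0 & 0 & 0 & 0 & \cdots & 0 & 1
\end{bmatrix}
\]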



Section 1.5

T1. Use your technology utility to verify Theorem 1.5.1 in several specific cases.

T2. Singular Matrices Find the inverse of the matrix in Example 4, and then see what your utility does when you try to invert
the matrix in Example 5.

Section 1.6

T1. Solving Ax = b by Inversion Use the method of Example 4 to solve the system in Example 3 of Section 1.1.

T2. Compare the solution of Ax = b by Gaussian elimination and by inversion for several large matrices. Can you see the
superiority of the former approach?

T3. Solve the linear system Ax = 2x, given that

\[ A = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix} \]


Section 1.7



T1. Diagonal, Symmetric, and Triangular Matrices Many technology utilities provide shortcuts for entering diagonal,
symmetric, and triangular matrices. Read your documentation on how to do this, and then experiment with entering various
matrices of these types.



T2. Properties of Triangular Matrices Confirm the results in Theorem 1.7.1 using some triangular matrices of your choice.



T3. Confirm the results in Theorem 1.7.4. What happens if A is not square?
CHAPTER 2



Determinants



INTRODUCTION: We are all familiar with functions such as $f(x) = \sin x$ and $f(x) = x^2$, which associate a real number
$f(x)$ with a real value of the variable x. Since both x and $f(x)$ assume only real values, such functions are described as
real-valued functions of a real variable. In this section we shall study the "determinant function," which is a real-valued
function of a matrix variable in the sense that it associates a real number $f(X)$ with a square matrix X. Our work on
determinant functions will have important applications to the theory of systems of linear equations and will also lead us to an
explicit formula for the inverse of an invertible matrix.
2.1

DETERMINANTS BY
COFACTOR EXPANSION



As noted in the introduction to this chapter, a "determinant" is a certain kind 
of function that associates a real number with a square matrix. In this 
section we will define this function. As a consequence of our work here, we 
will obtain a formula for the inverse of an invertible matrix as well as a 
formula for the solution to certain systems of linear equations in terms of 
determinants. 



Recall from Theorem 1.4.5 that the 2 x 2 matrix

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

is invertible if $ad - bc \neq 0$. The expression $ad - bc$ occurs so frequently in mathematics that it has a name; it is called the
determinant of the matrix A and is denoted by the symbol det(A) or |A|. With this notation, the formula for $A^{-1}$ given in
Theorem 1.4.5 is

\[ A^{-1} = \frac{1}{\det(A)} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \]

One of the goals of this chapter is to obtain analogs of this formula for square matrices of higher order. This will require that
we extend the concept of a determinant to square matrices of all orders.
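
As a small illustration of the 2 x 2 formula, here is a minimal Python sketch. The function names `det2` and `inverse2` are hypothetical, chosen only for this example, and exact fractions are used so that no roundoff obscures the result.

```python
from fractions import Fraction

def det2(a, b, c, d):
    """Determinant ad - bc of the 2 x 2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

def inverse2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] computed as (1/det) * [[d, -b], [-c, a]]."""
    det = det2(a, b, c, d)
    if det == 0:
        raise ValueError("matrix is not invertible (ad - bc = 0)")
    f = Fraction(1) / det
    return [[d * f, -b * f], [-c * f, a * f]]

# Sample matrix [[3, 1], [4, -2]]; its determinant is 3(-2) - 1(4) = -10.
print(det2(3, 1, 4, -2))
print(inverse2(3, 1, 4, -2))
```

Multiplying the sample matrix by the returned inverse gives the identity, which is a quick way to check the formula on other inputs.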



Minors and Cofactors 

There are several ways in which we might proceed. The approach in this section is a recursive approach: It defines the
determinant of an n x n matrix in terms of the determinants of certain (n - 1) x (n - 1) matrices. The (n - 1) x (n - 1)
matrices that will appear in this definition are submatrices of the original matrix. These submatrices are given a special
name:



DEFINITION



If A is a square matrix, then the minor of entry $a_{ij}$ is denoted by $M_{ij}$ and is defined to be the determinant of the
submatrix that remains after the ith row and jth column are deleted from A. The number $(-1)^{i+j} M_{ij}$ is denoted by $C_{ij}$
and is called the cofactor of entry $a_{ij}$.



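EXAMPLE 1 Finding Minors and Cofactors



Let

\[ A = \begin{bmatrix} 3 & 1 & -4 \\ 2 & 5 & 6 \\ 1 & 4 & 8 \end{bmatrix} \]

The minor of entry $a_{11}$ is obtained by deleting the first row and first column of A:

\[ M_{11} = \begin{vmatrix} 5 & 6 \\ 4 & 8 \end{vmatrix} = 16 \]

The cofactor of $a_{11}$ is

\[ C_{11} = (-1)^{1+1} M_{11} = M_{11} = 16 \]

Similarly, the minor of entry $a_{32}$ is obtained by deleting the third row and second column of A:

\[ M_{32} = \begin{vmatrix} 3 & -4 \\ 2 & 6 \end{vmatrix} = 26 \]

The cofactor of $a_{32}$ is

\[ C_{32} = (-1)^{3+2} M_{32} = -M_{32} = -26 \]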



Note that the cofactor and the minor of an element $a_{ij}$ differ only in sign; that is, $C_{ij} = \pm M_{ij}$. A quick way to determine
whether to use + or - is to use the fact that the sign relating $C_{ij}$ and $M_{ij}$ is in the ith row and jth column of the
"checkerboard" array

\[
\begin{bmatrix}
+ & - & + & - & + & \cdots \\
- & + & - & + & - & \cdots \\
+ & - & + & - & + & \cdots \\
- & + & - & + & - & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots &
\end{bmatrix}
\]

For example, $C_{11} = M_{11}$, $C_{21} = -M_{21}$, $C_{12} = -M_{12}$, $C_{22} = M_{22}$, and so on.
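
The definitions of minor and cofactor translate almost directly into code. The following is a minimal Python sketch, not an efficient method; the helper names `submatrix`, `det`, `minor`, and `cofactor` are assumptions made for this illustration, and indices are 0-based, so entry $a_{11}$ is `A[0][0]`. The sign factor is unaffected by the shift because (i + 1) + (j + 1) has the same parity as i + j.

```python
def submatrix(A, i, j):
    """Delete row i and column j (0-based) from the square matrix A."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row (recursive)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(submatrix(A, 0, j))
               for j in range(len(A)))

def minor(A, i, j):
    """Minor M_ij: determinant of A with row i and column j removed."""
    return det(submatrix(A, i, j))

def cofactor(A, i, j):
    """Cofactor C_ij = (-1)^(i+j) * M_ij, the checkerboard sign times the minor."""
    return (-1) ** (i + j) * minor(A, i, j)

# Matrix of Example 1.
A = [[3, 1, -4], [2, 5, 6], [1, 4, 8]]
print(minor(A, 0, 0), cofactor(A, 0, 0))   # M_11 and C_11
print(minor(A, 2, 1), cofactor(A, 2, 1))   # M_32 and C_32
```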

Strictly speaking, the determinant of a matrix is a number. However, it is common practice to "abuse" the terminology
slightly and use the term determinant to refer to the matrix whose determinant is being computed. Thus we might refer to

\[ \begin{vmatrix} 3 & 1 \\ 4 & -2 \end{vmatrix} \]

as a 2 x 2 determinant and call 3 the entry in the first row and first column of the determinant.

Cofactor Expansions 

The definition of a 3 x 3 determinant in terms of minors and cofactors is

\[
\begin{aligned}
\det(A) &= a_{11}M_{11} + a_{12}(-M_{12}) + a_{13}M_{13} \\
&= a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13}
\end{aligned}
\tag{1}
\]

Equation 1 shows that the determinant of A can be computed by multiplying the entries in the first row of A by their
corresponding cofactors and adding the resulting products. More generally, we define the determinant of an n x n matrix to
be

\[ \det(A) = a_{11}C_{11} + a_{12}C_{12} + \cdots + a_{1n}C_{1n} \]

This method of evaluating det(A) is called cofactor expansion along the first row of A.



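EXAMPLE 2 Cofactor Expansion Along the First Row



Let

\[ A = \begin{bmatrix} 3 & 1 & 0 \\ -2 & -4 & 3 \\ 5 & 4 & -2 \end{bmatrix} \]

Evaluate det(A) by cofactor expansion along the first row of A.



Solution

From 1,

\[
\begin{aligned}
\det(A) = \begin{vmatrix} 3 & 1 & 0 \\ -2 & -4 & 3 \\ 5 & 4 & -2 \end{vmatrix}
&= 3\begin{vmatrix} -4 & 3 \\ 4 & -2 \end{vmatrix} - 1\begin{vmatrix} -2 & 3 \\ 5 & -2 \end{vmatrix} + 0\begin{vmatrix} -2 & -4 \\ 5 & 4 \end{vmatrix} \\
&= 3(-4) - (1)(-11) + 0 = -1
\end{aligned}
\]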



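If A is a 3 x 3 matrix, then its determinant is

\[
\det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
- a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
+ a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}
\tag{2}
\]

\[
\begin{aligned}
&= a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31}) \\
&= a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}
\end{aligned}
\tag{3}
\]

By rearranging the terms in 3 in various ways, it is possible to obtain other formulas like 2. There should be no trouble
checking that all of the following are correct (see Exercise 28):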



\[
\begin{aligned}
\det(A) &= a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13} \\
&= a_{11}C_{11} + a_{21}C_{21} + a_{31}C_{31} \\
&= a_{21}C_{21} + a_{22}C_{22} + a_{23}C_{23} \\
&= a_{12}C_{12} + a_{22}C_{22} + a_{32}C_{32} \\
&= a_{31}C_{31} + a_{32}C_{32} + a_{33}C_{33} \\
&= a_{13}C_{13} + a_{23}C_{23} + a_{33}C_{33}
\end{aligned}
\tag{4}
\]

Note that in each equation, the entries and cofactors all come from the same row or column. These equations are called the
cofactor expansions of det(A).

The results we have just given for 3 x 3 matrices form a special case of the following general theorem, which we state 
without proof. 



THEOREM 2.1.1



Expansions by Cofactors

The determinant of an n x n matrix A can be computed by multiplying the entries in any row (or column) by their
cofactors and adding the resulting products; that is, for each $1 \le i \le n$ and $1 \le j \le n$,

\[ \det(A) = a_{1j}C_{1j} + a_{2j}C_{2j} + \cdots + a_{nj}C_{nj} \]
(cofactor expansion along the jth column)

and

\[ \det(A) = a_{i1}C_{i1} + a_{i2}C_{i2} + \cdots + a_{in}C_{in} \]
(cofactor expansion along the ith row)

Note that we may choose any row or any column.
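
Theorem 2.1.1 can be checked numerically on a specific matrix. The sketch below (Python; the helper names are hypothetical and indices are 0-based) expands the matrix of Example 2 along each of its three rows and three columns and collects the six results, which should all agree.

```python
def submatrix(A, i, j):
    """Delete row i and column j (0-based) from the square matrix A."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row (recursive)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(submatrix(A, 0, j))
               for j in range(len(A)))

def expand_along_row(A, i):
    """Sum of a_ij * C_ij over j, for a fixed row i."""
    return sum((-1) ** (i + j) * A[i][j] * det(submatrix(A, i, j))
               for j in range(len(A)))

def expand_along_column(A, j):
    """Sum of a_ij * C_ij over i, for a fixed column j."""
    return sum((-1) ** (i + j) * A[i][j] * det(submatrix(A, i, j))
               for i in range(len(A)))

# Matrix of Example 2.
A = [[3, 1, 0], [-2, -4, 3], [5, 4, -2]]
values = ([expand_along_row(A, i) for i in range(3)]
          + [expand_along_column(A, j) for j in range(3)])
print(values)   # six expansions of the same determinant
```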













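EXAMPLE 3 Cofactor Expansion Along the First Column



Let A be the matrix in Example 2. Evaluate det(A) by cofactor expansion along the first column of A.



Solution



From 4,

\[
\begin{aligned}
\det(A) = \begin{vmatrix} 3 & 1 & 0 \\ -2 & -4 & 3 \\ 5 & 4 & -2 \end{vmatrix}
&= 3\begin{vmatrix} -4 & 3 \\ 4 & -2 \end{vmatrix} - (-2)\begin{vmatrix} 1 & 0 \\ 4 & -2 \end{vmatrix} + 5\begin{vmatrix} 1 & 0 \\ -4 & 3 \end{vmatrix} \\
&= 3(-4) - (-2)(-2) + 5(3) = -1
\end{aligned}
\]

This agrees with the result obtained in Example 2.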



Remark In this example we had to compute three cofactors, but in Example 2 we only had to compute two of them, since 
the third was multiplied by zero. In general, the best strategy for evaluating a determinant by cofactor expansion is to expand 
along a row or column having the largest number of zeros. 



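EXAMPLE 4 Smart Choice of Row or Column



If A is the 4 x 4 matrix

\[ A = \begin{bmatrix} 1 & 0 & 0 & -1 \\ 3 & 1 & 2 & 2 \\ 1 & 0 & -2 & 1 \\ 2 & 0 & 0 & 1 \end{bmatrix} \]

then to find det(A) it will be easiest to use cofactor expansion along the second column, since it has the most zeros:

\[ \det(A) = 1 \cdot \begin{vmatrix} 1 & 0 & -1 \\ 1 & -2 & 1 \\ 2 & 0 & 1 \end{vmatrix} \]

For the 3 x 3 determinant, it will be easiest to use cofactor expansion along its second column, since it has the most zeros:

\[
\begin{aligned}
\det(A) &= 1 \cdot (-2) \cdot \begin{vmatrix} 1 & -1 \\ 2 & 1 \end{vmatrix} \\
&= -2(1 + 2) \\
&= -6
\end{aligned}
\]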

We would have found the same answer if we had used any other row or column. 
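
The strategy of expanding along the row or column with the most zeros can be automated. The following Python sketch is one possible way to do it, under the assumption that ties may be broken arbitrarily; the function name `det_most_zeros` is hypothetical. It skips the cofactors of zero entries entirely, which is where the savings come from.

```python
def submatrix(A, i, j):
    """Delete row i and column j (0-based) from the square matrix A."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det_most_zeros(A):
    """Cofactor expansion along the row or column containing the most zeros."""
    n = len(A)
    if n == 1:
        return A[0][0]
    # Score every row and every column by its number of zero entries.
    rows = [(sum(1 for x in A[i] if x == 0), "row", i) for i in range(n)]
    cols = [(sum(1 for i in range(n) if A[i][j] == 0), "col", j) for j in range(n)]
    _, kind, k = max(rows + cols)
    total = 0
    for t in range(n):
        i, j = (k, t) if kind == "row" else (t, k)
        if A[i][j] != 0:  # zero entries contribute nothing, so skip their cofactors
            total += (-1) ** (i + j) * A[i][j] * det_most_zeros(submatrix(A, i, j))
    return total

# Matrix of Example 4; the second column has three zeros.
A = [[1, 0, 0, -1], [3, 1, 2, 2], [1, 0, -2, 1], [2, 0, 0, 1]]
print(det_most_zeros(A))
```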



Adjoint of a Matrix 

In a cofactor expansion we compute det(A) by multiplying the entries in a row or column by their cofactors and adding the
resulting products. It turns out that if one multiplies the entries in any row by the corresponding cofactors from a different
row, the sum of these products is always zero. (This result also holds for columns.) Although we omit the general proof, the
next example illustrates the idea of the proof in a special case.



EXAMPLE 5 Entries and Cofactors from Different Rows



Let

\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \]

Consider the quantity

\[ a_{11}C_{31} + a_{12}C_{32} + a_{13}C_{33} \]

that is formed by multiplying the entries in the first row by the cofactors of the corresponding entries in the third row and
adding the resulting products. We now show that this quantity is equal to zero by the following trick. Construct a new matrix
$A'$ by replacing the third row of A with another copy of the first row. Thus

\[ A' = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{11} & a_{12} & a_{13} \end{bmatrix} \]

Let $C'_{31}$, $C'_{32}$, $C'_{33}$ be the cofactors of the entries in the third row of $A'$. Since the first two rows of A and $A'$ are the same, and
since the computations of $C_{31}$, $C_{32}$, $C_{33}$, $C'_{31}$, $C'_{32}$, and $C'_{33}$ involve only entries from the first two rows of A and $A'$, it
follows that

\[ C_{31} = C'_{31}, \quad C_{32} = C'_{32}, \quad C_{33} = C'_{33} \]

Since $A'$ has two identical rows, it follows from 3 that

\[ \det(A') = 0 \tag{5} \]

On the other hand, evaluating $\det(A')$ by cofactor expansion along the third row gives

\[ \det(A') = a_{11}C'_{31} + a_{12}C'_{32} + a_{13}C'_{33} = a_{11}C_{31} + a_{12}C_{32} + a_{13}C_{33} \tag{6} \]

From 5 and 6 we obtain

\[ a_{11}C_{31} + a_{12}C_{32} + a_{13}C_{33} = 0 \]
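
The same fact can be observed numerically. The Python sketch below (helper names are hypothetical, indices 0-based) takes a sample 3 x 3 matrix, the one that also appears in Example 6, and pairs the cofactors of the third row first with the entries of the first row and then with the entries of the third row itself.

```python
def submatrix(A, i, j):
    """Delete row i and column j (0-based) from the square matrix A."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row (recursive)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(submatrix(A, 0, j))
               for j in range(len(A)))

def cofactor(A, i, j):
    """Cofactor C_ij = (-1)^(i+j) * M_ij."""
    return (-1) ** (i + j) * det(submatrix(A, i, j))

A = [[3, 2, -1], [1, 6, 3], [2, -4, 0]]

# Entries of row 1 paired with cofactors of row 3: expected to vanish.
mixed = sum(A[0][j] * cofactor(A, 2, j) for j in range(3))
# Entries of row 3 paired with cofactors of row 3: this is det(A).
same = sum(A[2][j] * cofactor(A, 2, j) for j in range(3))
print(mixed, same)
```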



Now we'll use this fact to get a formula for $A^{-1}$.



DEFINITION



If A is any n x n matrix and $C_{ij}$ is the cofactor of $a_{ij}$, then the matrix

\[
\begin{bmatrix}
C_{11} & C_{12} & \cdots & C_{1n} \\
C_{21} & C_{22} & \cdots & C_{2n} \\
\vdots & \vdots & & \vdots \\
C_{n1} & C_{n2} & \cdots & C_{nn}
\end{bmatrix}
\]

is called the matrix of cofactors from A. The transpose of this matrix is called the adjoint of A and is denoted by adj(A).



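EXAMPLE 6 Adjoint of a 3 x 3 Matrix



Let

\[ A = \begin{bmatrix} 3 & 2 & -1 \\ 1 & 6 & 3 \\ 2 & -4 & 0 \end{bmatrix} \]

The cofactors of A are

\[
\begin{aligned}
C_{11} &= 12 & C_{12} &= 6 & C_{13} &= -16 \\
C_{21} &= 4 & C_{22} &= 2 & C_{23} &= 16 \\
C_{31} &= 12 & C_{32} &= -10 & C_{33} &= 16
\end{aligned}
\]

so the matrix of cofactors is

\[ \begin{bmatrix} 12 & 6 & -16 \\ 4 & 2 & 16 \\ 12 & -10 & 16 \end{bmatrix} \]

and the adjoint of A is

\[ \operatorname{adj}(A) = \begin{bmatrix} 12 & 4 & 12 \\ 6 & 2 & -10 \\ -16 & 16 & 16 \end{bmatrix} \]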
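
The computation in Example 6 can be reproduced with a short Python sketch. The names `cofactor_matrix` and `adjoint` are assumptions made for this illustration; the adjoint is obtained simply by transposing the matrix of cofactors.

```python
def submatrix(A, i, j):
    """Delete row i and column j (0-based) from the square matrix A."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row (recursive)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(submatrix(A, 0, j))
               for j in range(len(A)))

def cofactor_matrix(A):
    """Matrix whose (i, j) entry is the cofactor C_ij of A."""
    n = len(A)
    return [[(-1) ** (i + j) * det(submatrix(A, i, j)) for j in range(n)]
            for i in range(n)]

def adjoint(A):
    """adj(A): the transpose of the matrix of cofactors."""
    return [list(col) for col in zip(*cofactor_matrix(A))]

# Matrix of Example 6.
A = [[3, 2, -1], [1, 6, 3], [2, -4, 0]]
print(cofactor_matrix(A))
print(adjoint(A))
```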



We are now in a position to derive a formula for the inverse of an invertible matrix. We need to use an important fact that
will be proved in Section 2.3: The square matrix A is invertible if and only if det(A) is not zero.



THEOREM 2.1.2



Inverse of a Matrix Using Its Adjoint

If A is an invertible matrix, then

\[ A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A) \tag{7} \]





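Proof We show first that

\[ A \operatorname{adj}(A) = \det(A) I \]

Consider the product

\[
A \operatorname{adj}(A) =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{i1} & a_{i2} & \cdots & a_{in} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\begin{bmatrix}
C_{11} & C_{21} & \cdots & C_{j1} & \cdots & C_{n1} \\
C_{12} & C_{22} & \cdots & C_{j2} & \cdots & C_{n2} \\
\vdots & \vdots & & \vdots & & \vdots \\
C_{1n} & C_{2n} & \cdots & C_{jn} & \cdots & C_{nn}
\end{bmatrix}
\]

The entry in the ith row and jth column of the product A adj(A) is

\[ a_{i1}C_{j1} + a_{i2}C_{j2} + \cdots + a_{in}C_{jn} \tag{8} \]

(the ith row of A multiplied by the jth column of adj(A)).

If i = j, then 8 is the cofactor expansion of det(A) along the ith row of A (Theorem 2.1.1), and if $i \neq j$, then the a's and the
cofactors come from different rows of A, so the value of 8 is zero. Therefore,

\[
A \operatorname{adj}(A) =
\begin{bmatrix}
\det(A) & 0 & \cdots & 0 \\
0 & \det(A) & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \det(A)
\end{bmatrix}
= \det(A) I
\tag{9}
\]

Since A is invertible, $\det(A) \neq 0$. Therefore, Equation 9 can be rewritten as

\[ \frac{1}{\det(A)} [A \operatorname{adj}(A)] = I \quad \text{or} \quad A \left[ \frac{1}{\det(A)} \operatorname{adj}(A) \right] = I \]

Multiplying both sides on the left by $A^{-1}$ yields

\[ A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A) \]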



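EXAMPLE 7 Using the Adjoint to Find an Inverse Matrix



Use 7 to find the inverse of the matrix A in Example 6.



Solution



The reader can check that det(A) = 64. Thus

\[
A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A)
= \frac{1}{64} \begin{bmatrix} 12 & 4 & 12 \\ 6 & 2 & -10 \\ -16 & 16 & 16 \end{bmatrix}
= \begin{bmatrix}
\frac{12}{64} & \frac{4}{64} & \frac{12}{64} \\
\frac{6}{64} & \frac{2}{64} & -\frac{10}{64} \\
-\frac{16}{64} & \frac{16}{64} & \frac{16}{64}
\end{bmatrix}
\]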
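
Formula 7 can be turned into a small program. The Python sketch below assumes a square matrix with integer entries and uses exact fractions; the function names are hypothetical. It is meant as an illustration of the formula, not as an efficient inversion routine, and it checks the result by multiplying back.

```python
from fractions import Fraction

def submatrix(A, i, j):
    """Delete row i and column j (0-based) from the square matrix A."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row (recursive)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(submatrix(A, 0, j))
               for j in range(len(A)))

def adjoint(A):
    """adj(A): entry (i, j) is the cofactor C_ji of A."""
    n = len(A)
    return [[(-1) ** (i + j) * det(submatrix(A, j, i)) for j in range(n)]
            for i in range(n)]

def inverse_by_adjoint(A):
    """A^{-1} = adj(A) / det(A); assumes integer entries and det(A) != 0."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is not invertible (determinant is zero)")
    return [[Fraction(x, d) for x in row] for row in adjoint(A)]

def matmul(A, B):
    """Ordinary matrix product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Matrix of Example 6 and Example 7.
A = [[3, 2, -1], [1, 6, 3], [2, -4, 0]]
A_inv = inverse_by_adjoint(A)
print(A_inv)
print(matmul(A, A_inv))   # should be the 3 x 3 identity
```

Note that `Fraction` reduces automatically, so an entry such as 12/64 is displayed as 3/16.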



Applications of Formula 7 

Although the method in the preceding example is reasonable for inverting 3x3 matrices by hand, the inversion algorithm 
discussed in Section 1.5 is more efficient for larger matrices. It should be kept in mind, however, that the method of Section 
1.5 is just a computational procedure, whereas Formula 7 is an actual formula for the inverse. As we shall now see, this 
formula is useful for deriving properties of the inverse. 



In Section 1.7 we stated two results about inverses without proof. 



Theorem 1.7.1c: A triangular matrix is invertible if and only if its diagonal entries are all nonzero.

Theorem 1.7.1d: The inverse of an invertible lower triangular matrix is lower triangular, and the inverse of an invertible
upper triangular matrix is upper triangular.

We will now prove these results using the adjoint formula for the inverse. We need a preliminary result.

THEOREM 2.1.3



If A is an n x n triangular matrix (upper triangular, lower triangular, or diagonal), then det(A) is the product of the
entries on the main diagonal of the matrix; that is, $\det(A) = a_{11}a_{22} \cdots a_{nn}$.



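For simplicity of notation, we will prove the result for a 4 x 4 lower triangular matrix

\[
A = \begin{bmatrix}
a_{11} & 0 & 0 & 0 \\
a_{21} & a_{22} & 0 & 0 \\
a_{31} & a_{32} & a_{33} & 0 \\
a_{41} & a_{42} & a_{43} & a_{44}
\end{bmatrix}
\]

The argument in the n x n case is similar, as is the case of upper triangular matrices.



Proof of Theorem 2.1.3 (4 x 4 lower triangular case) By Theorem 2.1.1, the determinant of A may be found by cofactor
expansion along the first row:

\[
\det(A) = \begin{vmatrix}
a_{11} & 0 & 0 & 0 \\
a_{21} & a_{22} & 0 & 0 \\
a_{31} & a_{32} & a_{33} & 0 \\
a_{41} & a_{42} & a_{43} & a_{44}
\end{vmatrix}
= a_{11} \begin{vmatrix}
a_{22} & 0 & 0 \\
a_{32} & a_{33} & 0 \\
a_{42} & a_{43} & a_{44}
\end{vmatrix}
\]

Once again, it's easy to expand along the first row:

\[
\begin{aligned}
\det(A) &= a_{11}a_{22} \begin{vmatrix} a_{33} & 0 \\ a_{43} & a_{44} \end{vmatrix} \\
&= a_{11}a_{22}a_{33} \, |a_{44}| \\
&= a_{11}a_{22}a_{33}a_{44}
\end{aligned}
\]

where we have used the convention that the determinant of a 1 x 1 matrix [a] is a.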



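EXAMPLE 8 Determinant of an Upper Triangular Matrix



\[
\begin{vmatrix}
2 & 7 & -3 & 3 & 3 \\
0 & -3 & 7 & 5 & 1 \\
0 & 0 & 6 & 7 & 6 \\
0 & 0 & 0 & 9 & 8 \\
0 & 0 & 0 & 0 & 4
\end{vmatrix}
= (2)(-3)(6)(9)(4) = -1296
\]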
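
Theorem 2.1.3 gives a shortcut that is easy to compare against the general definition. The Python sketch below (function names are hypothetical) computes the product of the diagonal entries of a 5 x 5 upper triangular matrix with diagonal 2, -3, 6, 9, 4, as in Example 8, and checks it against a full cofactor expansion. The entries above the diagonal do not affect the result.

```python
from math import prod  # available in Python 3.8 and later

def submatrix(A, i, j):
    """Delete row i and column j (0-based) from the square matrix A."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row (recursive)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(submatrix(A, 0, j))
               for j in range(len(A)))

def diagonal_product(A):
    """Product of the main-diagonal entries; equals det(A) when A is triangular."""
    return prod(A[i][i] for i in range(len(A)))

U = [[2, 7, -3, 3, 3],
     [0, -3, 7, 5, 1],
     [0, 0, 6, 7, 6],
     [0, 0, 0, 9, 8],
     [0, 0, 0, 0, 4]]
print(diagonal_product(U), det(U))
```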



Proof of Theorem 1.7.1c Let $A = [a_{ij}]$ be a triangular matrix, so that its diagonal entries are

\[ a_{11}, a_{22}, \ldots, a_{nn} \]

From Theorem 2.1.3, the matrix A is invertible if and only if

\[ \det(A) = a_{11}a_{22} \cdots a_{nn} \]

is nonzero, which is true if and only if the diagonal entries are all nonzero.

We leave it as an exercise for the reader to use the adjoint formula for $A^{-1}$ to show that if $A = [a_{ij}]$ is an invertible
triangular matrix, then the successive diagonal entries of $A^{-1}$ are

\[ \frac{1}{a_{11}}, \frac{1}{a_{22}}, \ldots, \frac{1}{a_{nn}} \]

(See Example 3 of Section 1.7.)



Proof of Theorem 1.7.1d We will prove the result for upper triangular matrices and leave the lower triangular case as an
exercise. Assume that A is upper triangular and invertible. Since

\[ A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A) \]

we can prove that $A^{-1}$ is upper triangular by showing that adj(A) is upper triangular, or,
equivalently, that the matrix of cofactors is lower triangular. We can do this by showing that every
cofactor $C_{ij}$ with i < j (i.e., above the main diagonal) is zero. Since

\[ C_{ij} = (-1)^{i+j} M_{ij} \]

it suffices to show that each minor $M_{ij}$ with i < j is zero. For this purpose, let $B_{ij}$ be the matrix that
results when the ith row and jth column of A are deleted, so

\[ M_{ij} = \det(B_{ij}) \tag{10} \]

From the assumption that i < j, it follows that $B_{ij}$ is upper triangular (Exercise 32). Since A is upper
triangular, its (i + 1)-st row begins with at least i zeros. But the ith row of $B_{ij}$ is the (i + 1)-st row of
A with the entry in the jth column removed. Since i < j, none of the first i zeros is removed by
deleting the jth column; thus the ith row of $B_{ij}$ starts with at least i zeros, which implies that this
row has a zero on the main diagonal. It now follows from Theorem 2.1.3 that $\det(B_{ij}) = 0$ and from
10 that $M_{ij} = 0$.

■ 

Cramer's Rule 

The next theorem provides a formula for the solution of certain linear systems of n equations in n unknowns. This formula,
known as Cramer's rule, is of marginal interest for computational purposes, but it is useful for studying the mathematical
properties of a solution without the need for solving the system.



THEOREM 2.1.4



Cramer's Rule

If Ax = b is a system of n linear equations in n unknowns such that $\det(A) \neq 0$, then the system has a unique solution.
This solution is

\[ x_1 = \frac{\det(A_1)}{\det(A)}, \quad x_2 = \frac{\det(A_2)}{\det(A)}, \quad \ldots, \quad x_n = \frac{\det(A_n)}{\det(A)} \]

where $A_j$ is the matrix obtained by replacing the entries in the jth column of A by the entries in
the matrix

\[ \mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} \]



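Proof If $\det(A) \neq 0$, then A is invertible, and by Theorem 1.6.2, $\mathbf{x} = A^{-1}\mathbf{b}$ is the unique solution of Ax = b. Therefore, by
Theorem 2.1.2 we have

\[
\mathbf{x} = A^{-1}\mathbf{b} = \frac{1}{\det(A)} \operatorname{adj}(A)\,\mathbf{b}
= \frac{1}{\det(A)}
\begin{bmatrix}
C_{11} & C_{21} & \cdots & C_{n1} \\
C_{12} & C_{22} & \cdots & C_{n2} \\
\vdots & \vdots & & \vdots \\
C_{1n} & C_{2n} & \cdots & C_{nn}
\end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}
\]

Multiplying the matrices out gives

\[
\mathbf{x} = \frac{1}{\det(A)}
\begin{bmatrix}
b_1C_{11} + b_2C_{21} + \cdots + b_nC_{n1} \\
b_1C_{12} + b_2C_{22} + \cdots + b_nC_{n2} \\
\vdots \\
b_1C_{1n} + b_2C_{2n} + \cdots + b_nC_{nn}
\end{bmatrix}
\]

The entry in the jth row of x is therefore

\[ x_j = \frac{b_1C_{1j} + b_2C_{2j} + \cdots + b_nC_{nj}}{\det(A)} \tag{11} \]

Now let

\[
A_j = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1\,j-1} & b_1 & a_{1\,j+1} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2\,j-1} & b_2 & a_{2\,j+1} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{n\,j-1} & b_n & a_{n\,j+1} & \cdots & a_{nn}
\end{bmatrix}
\]

Since $A_j$ differs from A only in the jth column, it follows that the cofactors of entries $b_1, b_2, \ldots, b_n$ in $A_j$ are the same as the
cofactors of the corresponding entries in the jth column of A. The cofactor expansion of $\det(A_j)$ along the jth column is
therefore

\[ \det(A_j) = b_1C_{1j} + b_2C_{2j} + \cdots + b_nC_{nj} \]

Substituting this result in 11 gives

\[ x_j = \frac{\det(A_j)}{\det(A)} \]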



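EXAMPLE 9 Using Cramer's Rule to Solve a Linear System



Use Cramer's rule to solve

\[
\begin{aligned}
x_1 + 2x_3 &= 6 \\
-3x_1 + 4x_2 + 6x_3 &= 30 \\
-x_1 - 2x_2 + 3x_3 &= 8
\end{aligned}
\]

Solution

\[
A = \begin{bmatrix} 1 & 0 & 2 \\ -3 & 4 & 6 \\ -1 & -2 & 3 \end{bmatrix}, \quad
A_1 = \begin{bmatrix} 6 & 0 & 2 \\ 30 & 4 & 6 \\ 8 & -2 & 3 \end{bmatrix}, \quad
A_2 = \begin{bmatrix} 1 & 6 & 2 \\ -3 & 30 & 6 \\ -1 & 8 & 3 \end{bmatrix}, \quad
A_3 = \begin{bmatrix} 1 & 0 & 6 \\ -3 & 4 & 30 \\ -1 & -2 & 8 \end{bmatrix}
\]

Therefore,

\[
x_1 = \frac{\det(A_1)}{\det(A)} = \frac{-40}{44} = -\frac{10}{11}, \quad
x_2 = \frac{\det(A_2)}{\det(A)} = \frac{72}{44} = \frac{18}{11}, \quad
x_3 = \frac{\det(A_3)}{\det(A)} = \frac{152}{44} = \frac{38}{11}
\]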
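
Cramer's rule is short to express in code. The following Python sketch assumes integer coefficients and uses exact fractions; the function name `cramer` is hypothetical. As the text notes, the rule is of marginal computational interest, so this is an illustration of the formula rather than a recommended solver.

```python
from fractions import Fraction

def submatrix(A, i, j):
    """Delete row i and column j (0-based) from the square matrix A."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row (recursive)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(submatrix(A, 0, j))
               for j in range(len(A)))

def cramer(A, b):
    """Solve Ax = b by Cramer's rule; assumes integer entries and det(A) != 0."""
    d = det(A)
    if d == 0:
        raise ValueError("Cramer's rule does not apply: det(A) = 0")
    solution = []
    for j in range(len(A)):
        # A_j: replace column j of A by the entries of b.
        A_j = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(A)]
        solution.append(Fraction(det(A_j), d))
    return solution

# System of Example 9.
A = [[1, 0, 2], [-3, 4, 6], [-1, -2, 3]]
b = [6, 30, 8]
print(cramer(A, b))
```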




Gabriel Cramer (1704-1752) was a Swiss mathematician. Although Cramer does not rank with the great mathematicians 
of his time, his contributions as a disseminator of mathematical ideas have earned him a well-deserved place in the history 
of mathematics. Cramer traveled extensively and met many of the leading mathematicians of his day. 



Cramer's most widely known work, Introduction à l'analyse des lignes courbes algébriques (1750), was a study and
classification of algebraic curves; Cramer's rule appeared in the appendix. Although the rule bears his name, variations of
the idea were formulated earlier by various mathematicians. However, Cramer's superior notation helped clarify and
popularize the technique.

Overwork combined with a fall from a carriage led to his death at the age of 48. Cramer was apparently a good-natured 
and pleasant person with broad interests. He wrote on philosophy of law and government and the history of mathematics. 
He served in public office, participated in artillery and fortifications activities for the government, instructed workers on 
techniques of cathedral repair, and undertook excavations of cathedral archives. Cramer received numerous honors for his 
activities. 



Remark To solve a system of n equations in n unknowns by Cramer's rule, it is necessary to evaluate n + 1 determinants of
n x n matrices. For systems with more than three equations, Gaussian elimination is far more efficient. However, Cramer's rule
does give a formula for the solution if the determinant of the coefficient matrix is nonzero.



Exercise Set 2.1 






1. Let

\[ A = \begin{bmatrix} 1 & -2 & 3 \\ 6 & 7 & -1 \\ -3 & 1 & 4 \end{bmatrix} \]

(a) Find all the minors of A.

(b) Find all the cofactors.

2. Let

\[ A = \begin{bmatrix} 4 & -1 & 1 & 6 \\ 0 & 0 & -3 & 3 \\ 4 & 1 & 0 & 14 \\ 4 & 1 & 3 & 2 \end{bmatrix} \]

Find

(a) $M_{13}$ and $C_{13}$

(b) $M_{23}$ and $C_{23}$

(c) $M_{22}$ and $C_{22}$

(d) $M_{21}$ and $C_{21}$

3. Evaluate the determinant of the matrix in Exercise 1 by a cofactor expansion along

(a) the first row

(b) the first column

(c) the second row

(d) the second column

(e) the third row

(f) the third column

4. For the matrix in Exercise 1, find

(a) adj(A)

(b) $A^{-1}$ using Theorem 2.1.2



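In Exercises 5-10 evaluate det(A) by a cofactor expansion along a row or column of your choice.

5. \[ A = \begin{bmatrix} -3 & 0 & 7 \\ 2 & 5 & 1 \\ -1 & 0 & 5 \end{bmatrix} \]

6. \[ A = \begin{bmatrix} 3 & 3 & 1 \\ 1 & 0 & -4 \\ 1 & -3 & 5 \end{bmatrix} \]

7. \[ A = \begin{bmatrix} 1 & k & k^2 \\ 1 & k & k^2 \\ 1 & k & k^2 \end{bmatrix} \]

8. \[ A = \begin{bmatrix} k+1 & k-1 & 7 \\ 2 & k-3 & 4 \\ 5 & k+1 & k \end{bmatrix} \]

9. \[ A = \begin{bmatrix} 3 & 3 & 0 & 5 \\ 2 & 2 & 0 & -2 \\ 4 & 1 & -3 & 0 \\ 2 & 10 & 3 & 2 \end{bmatrix} \]

10. \[ A = \begin{bmatrix} 4 & 0 & 0 & 1 & 0 \\ 3 & 3 & 3 & -1 & 0 \\ 1 & 2 & 4 & 2 & 3 \\ 9 & 4 & 6 & 2 & 3 \\ 2 & 2 & 4 & 2 & 3 \end{bmatrix} \]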



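In Exercises 11-14 find $A^{-1}$ using Theorem 2.1.2.

11. \[ A = \begin{bmatrix} 2 & 5 & 5 \\ -1 & -1 & 0 \\ 2 & 4 & 3 \end{bmatrix} \]

12. \[ A = \begin{bmatrix} 2 & 0 & 3 \\ 0 & 3 & 2 \\ -2 & 0 & -4 \end{bmatrix} \]

13. \[ A = \begin{bmatrix} 2 & -3 & 5 \\ 0 & 1 & -3 \\ 0 & 0 & 2 \end{bmatrix} \]

14. \[ A = \begin{bmatrix} 2 & 0 & 0 \\ 8 & 1 & 0 \\ -5 & 3 & 6 \end{bmatrix} \]

15. Let

\[ A = \begin{bmatrix} 1 & 3 & 1 & 1 \\ 2 & 5 & 2 & 2 \\ 1 & 3 & 8 & 9 \\ 1 & 3 & 2 & 2 \end{bmatrix} \]

(a) Evaluate $A^{-1}$ using Theorem 2.1.2.

(b) Evaluate $A^{-1}$ using the method of Example 4 in Section 1.5.

(c) Which method involves less computation?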



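In Exercises 16-21 solve by Cramer's rule, where it applies.

16. \[ \begin{aligned} 7x_1 - 2x_2 &= 3 \\ 3x_1 + x_2 &= 5 \end{aligned} \]

17. \[ \begin{aligned} 4x + 5y &= 2 \\ 11x + y + 2z &= 3 \\ x + 5y + 2z &= 1 \end{aligned} \]

18. \[ \begin{aligned} x - 4y + z &= 6 \\ 4x - y + 2z &= -1 \\ 2x + 2y - 3z &= -20 \end{aligned} \]

19. \[ \begin{aligned} x_1 - 3x_2 + x_3 &= 4 \\ 2x_1 - x_2 &= -2 \\ 4x_1 - 3x_3 &= 0 \end{aligned} \]

20. \[ \begin{aligned} -x_1 - 4x_2 + 2x_3 + x_4 &= -32 \\ 2x_1 - x_2 + 7x_3 + 9x_4 &= 14 \\ -x_1 + x_2 + 3x_3 + x_4 &= 11 \\ x_1 - 2x_2 + x_3 - 4x_4 &= -4 \end{aligned} \]

21. \[ \begin{aligned} 3x_1 - x_2 + x_3 &= 4 \\ -x_1 + 7x_2 - 2x_3 &= 1 \\ 2x_1 + 6x_2 - x_3 &= 5 \end{aligned} \]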

22. Show that the matrix

\[ A = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

is invertible for all values of $\theta$; then find $A^{-1}$ using Theorem 2.1.2.

23. Use Cramer's rule to solve for y without solving for x, z, and w.

\[
\begin{aligned}
4x + y + z + w &= 6 \\
3x + 7y - z + w &= 1 \\
7x + 3y - 5z + 8w &= -3 \\
x + y + z + 2w &= 3
\end{aligned}
\]

24. Let Ax = b be the system in Exercise 23.

(a) Solve by Cramer's rule.

(b) Solve by Gauss-Jordan elimination.

(c) Which method involves fewer computations?



25. Prove that if det(A) = 1 and all the entries in A are integers, then all the entries in $A^{-1}$ are integers.

26. Let Ax = b be a system of n linear equations in n unknowns with integer coefficients and integer constants. Prove that if
det(A) = 1, the solution x has integer entries.

27. Prove that if A is an invertible lower triangular matrix, then $A^{-1}$ is lower triangular.

28. Derive the last cofactor expansion listed in Formula 4.



29. Prove: The equation of the line through the distinct points $(a_1, b_1)$ and $(a_2, b_2)$ can be written as

$$\begin{vmatrix} x & y & 1 \\ a_1 & b_1 & 1 \\ a_2 & b_2 & 1 \end{vmatrix} = 0$$



30. Prove: $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$ are collinear points if and only if

$$\begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix} = 0$$



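31. (a) If

$$A = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix}$$

is an "upper triangular" block matrix, where $A_{11}$ and $A_{22}$ are square matrices, then $\det(A) = \det(A_{11})\det(A_{22})$. Use this result to evaluate det(A) for

$$A = \begin{bmatrix} 2 & -1 & 2 & 5 & 6 \\ 4 & 3 & -1 & 3 & 4 \\ 0 & 0 & 1 & 3 & 5 \\ 0 & 0 & -2 & 6 & 2 \\ 0 & 0 & 3 & 5 & 2 \end{bmatrix}$$

(b) Verify your answer in part (a) by using a cofactor expansion to evaluate det(A).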



32. Prove that if A is upper triangular and $B_{ij}$ is the matrix that results when the ith row and jth column of A are deleted, then $B_{ij}$ is upper triangular if i < j.



Discussion
Discovery

33. What is the maximum number of zeros that a 4 × 4 matrix can have without having a zero determinant? Explain your reasoning.



34. Let A be a matrix of the form

$$A = \begin{bmatrix} * & * & 0 & 0 & 0 \\ * & * & 0 & 0 & 0 \\ * & * & 0 & 0 & 0 \\ * & * & * & * & * \\ * & * & * & * & * \end{bmatrix}$$

How many different values can you obtain for det(A) by substituting numerical values (not necessarily all the same) for the *'s? Explain your reasoning.

35. Indicate whether the statement is always true or sometimes false. Justify your answer by giving a logical argument or a counterexample.

(a) A adj(A) is a diagonal matrix for every square matrix A.

(b) In theory, Cramer's rule can be used to solve any system of linear equations, although the amount of computation may be enormous.

(c) If A is invertible, then adj(A) must also be invertible.

(d) If A has a row of zeros, then so does adj(A).

Copyright © 2005 John Wiley & Sons, Inc. All rights reserved. 



2.2

EVALUATING DETERMINANTS BY ROW REDUCTION

In this section we shall show that the determinant of a square matrix can be evaluated by reducing the matrix to row-echelon form. This method is important since it is the most computationally efficient way to find the determinant of a general matrix.



A Basic Theorem 

We begin with a fundamental theorem that will lead us to an efficient procedure for evaluating the determinant of a matrix of any 
order n . 

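THEOREM 2.2.1

Let A be a square matrix. If A has a row of zeros or a column of zeros, then det(A) = 0.

Proof By Theorem 2.1.1, the determinant of A found by cofactor expansion along the row or column of all zeros is

$$\det(A) = 0 \cdot C_1 + 0 \cdot C_2 + \cdots + 0 \cdot C_n$$

where $C_1, \ldots, C_n$ are the cofactors for that row or column. Hence det(A) is zero.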

Here is another useful theorem: 
THEOREM 2.2.2

Let A be a square matrix. Then $\det(A) = \det(A^T)$.

Proof By Theorem 2.1.1, the determinant of A found by cofactor expansion along its first row is the same as the determinant of $A^T$ found by cofactor expansion along its first column.



Remark Because of Theorem 2.2.2, nearly every theorem about determinants that contains the word row in its statement is also true 
when the word column is substituted for row. To prove a column statement, one need only transpose the matrix in question, to convert 
the column statement to a row statement, and then apply the corresponding known result for rows. 

Elementary Row Operations 

The next theorem shows how an elementary row operation on a matrix affects the value of its determinant. 
THEOREM 2.2.3 



Let A be an n × n matrix.

(a) If B is the matrix that results when a single row or single column of A is multiplied by a scalar k, then det(B) = k det(A).

(b) If B is the matrix that results when two rows or two columns of A are interchanged, then det(B) = -det(A).

(c) If B is the matrix that results when a multiple of one row of A is added to another row or when a multiple of one column is added to another column, then det(B) = det(A).
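
The three parts of the theorem can be spot-checked numerically. The following is only an illustrative sketch, not part of the text: the helper `det` (cofactor expansion along the first row), the sample matrix `A`, and the scalar `k` are assumptions chosen for the check.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        # Minor: delete the first row and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

A = [[1, 2, 3], [0, 1, 4], [1, 2, 1]]  # arbitrary sample matrix (assumption)
k = 5                                   # arbitrary scalar (assumption)

scaled = [[k * x for x in A[0]], A[1], A[2]]                   # part (a)
swapped = [A[1], A[0], A[2]]                                   # part (b)
added = [[x + k * y for x, y in zip(A[0], A[1])], A[1], A[2]]  # part (c)

print(det(A), det(scaled), det(swapped), det(added))  # -2 -10 2 -2
```

A single numerical check of this kind does not prove the theorem; it only confirms that the three relationships hold for this particular matrix.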



We omit the proof but give the following example that illustrates the theorem for 3 × 3 determinants.

EXAMPLE 1 Theorem 2.2.3 Applied to 3 × 3 Determinants

We will verify the equation in the first row of Table 1 and leave the last two for the reader. By Theorem 2.1.1, the determinant of B may be found by cofactor expansion along the first row:

$$\det(B) = \begin{vmatrix} ka_{11} & ka_{12} & ka_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = ka_{11}C_{11} + ka_{12}C_{12} + ka_{13}C_{13} = k(a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13}) = k\det(A)$$

since $C_{11}$, $C_{12}$, and $C_{13}$ do not depend on the first row of the matrix, and A and B differ only in their first rows.



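Table 1

Relationship: $\begin{vmatrix} ka_{11} & ka_{12} & ka_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = k \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$, that is, det(B) = k det(A)
Operation: The first row of A is multiplied by k.

Relationship: $\begin{vmatrix} a_{21} & a_{22} & a_{23} \\ a_{11} & a_{12} & a_{13} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = - \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$, that is, det(B) = -det(A)
Operation: The first and second rows of A are interchanged.

Relationship: $\begin{vmatrix} a_{11} + ka_{21} & a_{12} + ka_{22} & a_{13} + ka_{23} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$, that is, det(B) = det(A)
Operation: A multiple of the second row of A is added to the first row.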



Remark As illustrated by the first equation in Table 1, part (a) of Theorem 2.2.3 enables us to bring a "common factor" from any row (or column) through the determinant sign.

Elementary Matrices

Recall that an elementary matrix results from performing a single elementary row operation on an identity matrix; thus, if we let $A = I_n$ in Theorem 2.2.3 [so that we have $\det(A) = \det(I_n) = 1$], then the matrix B is an elementary matrix, and the theorem yields the following result about determinants of elementary matrices.



THEOREM 2.2.4 



Let E be an n × n elementary matrix.

(a) If E results from multiplying a row of $I_n$ by k, then det(E) = k.

(b) If E results from interchanging two rows of $I_n$, then det(E) = -1.

(c) If E results from adding a multiple of one row of $I_n$ to another, then det(E) = 1.



EXAMPLE 2 Determinants of Elementary Matrices 



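The following determinants of elementary matrices, which are evaluated by inspection, illustrate Theorem 2.2.4.

$$\begin{vmatrix} 1 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{vmatrix} = 3$$
The second row of $I_4$ was multiplied by 3.

$$\begin{vmatrix} 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{vmatrix} = -1$$
The first and last rows of $I_4$ were interchanged.

$$\begin{vmatrix} 1 & 0 & 0 & 7 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{vmatrix} = 1$$
7 times the last row of $I_4$ was added to the first row.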



Matrices with Proportional Rows or Columns 

If a square matrix A has two proportional rows, then a row of zeros can be introduced by adding a suitable multiple of one of the rows to the other. Similarly for columns. But adding a multiple of one row or column to another does not change the determinant, so from Theorem 2.2.1, we must have det(A) = 0. This proves the following theorem.

THEOREM 2.2.5

If A is a square matrix with two proportional rows or two proportional columns, then det(A) = 0.



EXAMPLE 3 Introducing Zero Rows 



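The following computation illustrates the introduction of a row of zeros when there are two proportional rows:

$$\begin{vmatrix} 1 & 3 & -2 & 4 \\ 2 & 6 & -4 & 8 \\ 3 & 9 & 1 & 5 \\ 1 & 1 & 4 & 8 \end{vmatrix} = \begin{vmatrix} 1 & 3 & -2 & 4 \\ 0 & 0 & 0 & 0 \\ 3 & 9 & 1 & 5 \\ 1 & 1 & 4 & 8 \end{vmatrix} = 0$$

The second row is 2 times the first, so we added -2 times the first row to the second to introduce a row of zeros.

Each of the following matrices has two proportional rows or columns; thus, each has a determinant of zero.

$$\begin{bmatrix} -1 & 4 \\ -2 & 8 \end{bmatrix}, \qquad \begin{bmatrix} 1 & -2 & 7 \\ -4 & 8 & 5 \\ 2 & -4 & 3 \end{bmatrix}, \qquad \begin{bmatrix} 3 & -1 & 4 & -5 \\ 6 & -2 & 5 & 2 \\ 5 & 8 & 1 & 4 \\ -9 & 3 & -12 & 15 \end{bmatrix}$$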



Evaluating Determinants by Row Reduction 

We shall now give a method for evaluating determinants that involves substantially less computation than the cofactor expansion 
method. The idea of the method is to reduce the given matrix to upper triangular form by elementary row operations, then compute the 
determinant of the upper triangular matrix (an easy computation), and then relate that determinant to that of the original matrix. Here 
is an example: 



EXAMPLE 4 Using Row Reduction to Evaluate a Determinant 



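Evaluate det(A) where

$$A = \begin{bmatrix} 0 & 1 & 5 \\ 3 & -6 & 9 \\ 2 & 6 & 1 \end{bmatrix}$$

Solution

We will reduce A to row-echelon form (which is upper triangular) and apply Theorem 2.2.3:

$$\det(A) = \begin{vmatrix} 0 & 1 & 5 \\ 3 & -6 & 9 \\ 2 & 6 & 1 \end{vmatrix} = -\begin{vmatrix} 3 & -6 & 9 \\ 0 & 1 & 5 \\ 2 & 6 & 1 \end{vmatrix}$$
The first and second rows of A were interchanged.

$$= -3\begin{vmatrix} 1 & -2 & 3 \\ 0 & 1 & 5 \\ 2 & 6 & 1 \end{vmatrix}$$
A common factor of 3 from the first row was taken through the determinant sign.

$$= -3\begin{vmatrix} 1 & -2 & 3 \\ 0 & 1 & 5 \\ 0 & 10 & -5 \end{vmatrix}$$
-2 times the first row was added to the third row.

$$= -3\begin{vmatrix} 1 & -2 & 3 \\ 0 & 1 & 5 \\ 0 & 0 & -55 \end{vmatrix}$$
-10 times the second row was added to the third row.

$$= (-3)(-55)\begin{vmatrix} 1 & -2 & 3 \\ 0 & 1 & 5 \\ 0 & 0 & 1 \end{vmatrix}$$
A common factor of -55 from the last row was taken through the determinant sign.

$$= (-3)(-55)(1) = 165$$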



Remark The method of row reduction is well suited for computer evaluation of determinants because it is computationally efficient 



and easily programmed. However, cofactor expansion is often easier for hand computation. 
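
To illustrate how easily the method can be programmed, here is a minimal sketch in Python. It is not taken from the text: the function name `det_by_row_reduction` and the use of exact `Fraction` arithmetic are assumptions made for the illustration. It reduces the matrix to upper triangular form, tracks sign changes from row interchanges, and multiplies the diagonal entries.

```python
from fractions import Fraction

def det_by_row_reduction(matrix):
    """Determinant via reduction to upper triangular form (exact arithmetic)."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    sign = 1
    for col in range(n):
        # Look for a row at or below the diagonal with a nonzero entry here.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column: determinant is zero
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign  # a row interchange changes the sign (Theorem 2.2.3b)
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            # Adding a multiple of one row to another leaves the determinant
            # unchanged (Theorem 2.2.3c).
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= a[i][i]  # determinant of a triangular matrix
    return result

# Matrix assumed to be the one from Example 4.
print(det_by_row_reduction([[0, 1, 5], [3, -6, 9], [2, 6, 1]]))  # 165
```

Unlike the hand computation in Example 4, this sketch never factors a scalar out of a row; it leaves the pivots in place and multiplies them at the end, which gives the same value.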



EXAMPLE 5 Using Column Operations to Evaluate a Determinant 



Compute the determinant of

$$A = \begin{bmatrix} 1 & 0 & 0 & 3 \\ 2 & 7 & 0 & 6 \\ 0 & 6 & 3 & 0 \\ 7 & 3 & 1 & -5 \end{bmatrix}$$

Solution

This determinant could be computed as above by using elementary row operations to reduce A to row-echelon form, but we can put A in lower triangular form in one step by adding -3 times the first column to the fourth to obtain

$$\det(A) = \det\begin{bmatrix} 1 & 0 & 0 & 0 \\ 2 & 7 & 0 & 0 \\ 0 & 6 & 3 & 0 \\ 7 & 3 & 1 & -26 \end{bmatrix} = (1)(7)(3)(-26) = -546$$



This example points out the utility of keeping an eye open for column operations that can shorten computations. 

Cofactor expansion and row or column operations can sometimes be used in combination to provide an effective method for 
evaluating determinants. The following example illustrates this idea. 



EXAMPLE 6 Row Operations and Cofactor Expansion 



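Evaluate det(A) where

$$A = \begin{bmatrix} 3 & 5 & -2 & 6 \\ 1 & 2 & -1 & 1 \\ 2 & 4 & 1 & 5 \\ 3 & 7 & 5 & 3 \end{bmatrix}$$

Solution

By adding suitable multiples of the second row to the remaining rows, we obtain

$$\det(A) = \begin{vmatrix} 0 & -1 & 1 & 3 \\ 1 & 2 & -1 & 1 \\ 0 & 0 & 3 & 3 \\ 0 & 1 & 8 & 0 \end{vmatrix}$$

$$= -\begin{vmatrix} -1 & 1 & 3 \\ 0 & 3 & 3 \\ 1 & 8 & 0 \end{vmatrix}$$
Cofactor expansion along the first column

$$= -\begin{vmatrix} -1 & 1 & 3 \\ 0 & 3 & 3 \\ 0 & 9 & 3 \end{vmatrix}$$
We added the first row to the third row.

$$= -(-1)\begin{vmatrix} 3 & 3 \\ 9 & 3 \end{vmatrix}$$
Cofactor expansion along the first column

$$= -18$$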

Exercise Set 2.2 






Verify that $\det(A) = \det(A^T)$ for



(a) 



A = 



2 3 

1 4 



(b) 



A = 



2 -1 3
1 2 4
5 -3 6



Evaluate the following determinants by inspection. 



(a) 



3 


-17 


4 





5 


1 








-2 



(b) 



{i o 

-8 j/2 

7 

9 5 















-1 





6 


1 



(c) 



-2 


1 3 


1 


-7 4 


-2 


1 3 



(d) 



1 -2 3 
2-4 6 
5 -8 1 



Find the determinants of the following elementary matrices by inspection. 



(a) 



1 

1 










o~ 








5 








1 



(b) 



1 








o" 








1 








1 

















1 



(c) 



1 

1 

1 







-9 



1 



In Exercises 4-11 evaluate the determinant of the given matrix by reducing the matrix to row-echelon form. 



3 6 



-2 1 



-9 

-2 

5 



"0 


3 


r 


1 


1 


2 


3 


2 


4 



1 -3 

2 4 1 
5-2 2 



-2 7 -2 
1 5 



1 -2 

5 -9 

1 2 

2 8 



3 


f 


6 


3 


6 


-2 


6 


1 



2 1 


3 


f 


1 


1 


1 


2 


1 





1 


2 


3 



10. 



11. 



12. 






1 


1 


1 


1 


1 


1 


1 


2 


2 


2 


2 


1 


1 





3 


3 


3 


1 


2 








3 


3 







3 1 
■7 
1 
2 




5 


3~ 


-4 


2 





1 


1 


1 


1 


1 



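Given that

$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = -6,$$

find

(a) $\begin{vmatrix} d & e & f \\ g & h & i \\ a & b & c \end{vmatrix}$

(b) $\begin{vmatrix} 3a & 3b & 3c \\ -d & -e & -f \\ 4g & 4h & 4i \end{vmatrix}$

(c) $\begin{vmatrix} a+g & b+h & c+i \\ d & e & f \\ g & h & i \end{vmatrix}$

(d) $\begin{vmatrix} -3a & -3b & -3c \\ d & e & f \\ g-4d & h-4e & i-4f \end{vmatrix}$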



13. Use row reduction to show that

$$\begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix} = (b - a)(c - a)(c - b)$$



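14. Use an argument like that in the proof of Theorem 2.1.3 to show that

(a) $\det\begin{bmatrix} 0 & 0 & a_{13} \\ 0 & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = -a_{13}a_{22}a_{31}$

(b) $\det\begin{bmatrix} 0 & 0 & 0 & a_{14} \\ 0 & 0 & a_{23} & a_{24} \\ 0 & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix} = a_{14}a_{23}a_{32}a_{41}$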



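15. Prove the following special cases of Theorem 2.2.3.

(a) $\begin{vmatrix} a_{21} & a_{22} & a_{23} \\ a_{11} & a_{12} & a_{13} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = -\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$

(b) $\begin{vmatrix} a_{11} + ka_{21} & a_{12} + ka_{22} & a_{13} + ka_{23} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$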



16. 



Repeat Exercises 4-7 using a combination of row reduction and cofactor expansion, as in Example 6. 



17. 



Repeat Exercises 8-11 using a combination of row reduction and cofactor expansion, as in Example 6.



Discussion 
Discovery 



18. 



In each part, find det(A) by inspection, and explain your reasoning.



(a) 


A = 








1 


1 







1 









(b) 





1 


A = 


10 
10 




10 



19. By inspection, solve the equation

$$\begin{vmatrix} x & 5 & 7 \\ 0 & x+1 & 6 \\ 0 & 0 & 2x-1 \end{vmatrix} = 0$$

Explain your reasoning.



20. (a) By inspection, find two solutions of the equation

$$\begin{vmatrix} 1 & x & x^2 \\ 1 & 1 & 1 \\ 1 & -3 & 9 \end{vmatrix} = 0$$

(b) Is it possible that there are other solutions? Justify your answer.



21. How many arithmetic operations are needed, in general, to find det(A) by row reduction? By cofactor expansion?






2.3 

PROPERTIES OF THE 

DETERMINANT 

FUNCTION 



In this section we shall develop some of the fundamental properties of the 
determinant function. Our work here will give us some further insight into the 
relationship between a square matrix and its determinant. One of the 
immediate consequences of this material will be the determinant test for the 
invertibility of a matrix. 



Basic Properties of Determinants 

Suppose that A and B are n × n matrices and k is any scalar. We begin by considering possible relationships between det(A), det(B), and

det(kA), det(A + B), and det(AB)

Since a common factor of any row of a matrix can be moved through the det sign, and since each of the n rows in kA has a common factor of k, we obtain

$$\det(kA) = k^n \det(A) \qquad (1)$$

For example,

$$\begin{vmatrix} ka_{11} & ka_{12} & ka_{13} \\ ka_{21} & ka_{22} & ka_{23} \\ ka_{31} & ka_{32} & ka_{33} \end{vmatrix} = k^3 \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$$



Unfortunately, no simple relationship exists among det(A), det(B), and det(A + B). In particular, we emphasize that det(A + B) will usually not be equal to det(A) + det(B). The following example illustrates this fact.

EXAMPLE 1 det(A + B) ≠ det(A) + det(B)

Consider

$$A = \begin{bmatrix} 1 & 2 \\ 2 & 5 \end{bmatrix}, \qquad B = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}, \qquad A + B = \begin{bmatrix} 4 & 3 \\ 3 & 8 \end{bmatrix}$$

We have det(A) = 1, det(B) = 8, and det(A + B) = 23; thus

det(A + B) ≠ det(A) + det(B)
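
The scaling rule and the failure of additivity can both be checked with a few lines of Python. This is a sketch for illustration only: the helper `det` (cofactor expansion along the first row) and the scalar `k` are assumptions, and the matrices are assumed to be those of Example 1.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )

A = [[1, 2], [2, 5]]
B = [[3, 1], [1, 3]]
k = 3  # arbitrary scalar (assumption)

kA = [[k * x for x in row] for row in A]
A_plus_B = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

print(det(kA), k ** len(A) * det(A))   # 9 9
print(det(A_plus_B), det(A) + det(B))  # 23 9
```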



In spite of the negative tone of the preceding example, there is one important relationship concerning sums of determinants that is often useful. To obtain it, consider two 2 × 2 matrices that differ only in the second row:

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} a_{11} & a_{12} \\ b_{21} & b_{22} \end{bmatrix}$$

We have

$$\det(A) + \det(B) = (a_{11}a_{22} - a_{12}a_{21}) + (a_{11}b_{22} - a_{12}b_{21}) = a_{11}(a_{22} + b_{22}) - a_{12}(a_{21} + b_{21}) = \det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{bmatrix}$$

Thus

$$\det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} + \det\begin{bmatrix} a_{11} & a_{12} \\ b_{21} & b_{22} \end{bmatrix} = \det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{bmatrix}$$

This is a special case of the following general result.

THEOREM 2.3.1

Let A, B, and C be n × n matrices that differ only in a single row, say the rth, and assume that the rth row of C can be obtained by adding corresponding entries in the rth rows of A and B. Then

det(C) = det(A) + det(B)

The same result holds for columns.



EXAMPLE 2 Using Theorem 2.3.1 



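By evaluating the determinants, the reader can check that

$$\det\begin{bmatrix} 1 & 7 & 5 \\ 2 & 0 & 3 \\ 1+0 & 4+1 & 7+(-1) \end{bmatrix} = \det\begin{bmatrix} 1 & 7 & 5 \\ 2 & 0 & 3 \\ 1 & 4 & 7 \end{bmatrix} + \det\begin{bmatrix} 1 & 7 & 5 \\ 2 & 0 & 3 \\ 0 & 1 & -1 \end{bmatrix}$$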
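
The check left to the reader can also be done by machine. The sketch below is an illustration rather than part of the text: the helper `det` is an assumption, and the three matrices are assumed to be those of Example 2 (they agree in the first two rows, and the third row of `C` is the sum of the third rows of `A` and `B`).

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )

A = [[1, 7, 5], [2, 0, 3], [1, 4, 7]]
B = [[1, 7, 5], [2, 0, 3], [0, 1, -1]]
# C shares rows 1 and 2 with A and B; its third row is the sum of theirs.
C = A[:2] + [[x + y for x, y in zip(A[2], B[2])]]

print(det(A), det(B), det(C))  # -49 21 -28
```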



Determinant of a Matrix Product 

When one considers the complexity of the definitions of matrix multiplication and determinants, it would seem unlikely that any simple relationship should exist between them. This is what makes the elegant simplicity of the following result so surprising: We will show that if A and B are square matrices of the same size, then

$$\det(AB) = \det(A)\det(B) \qquad (2)$$

The proof of this theorem is fairly intricate, so we will have to develop some preliminary results first. We begin with the special case of Formula 2 in which A is an elementary matrix. Because this special case is only a prelude to Formula 2, we call it a lemma.

LEMMA 2.3.2

If B is an n × n matrix and E is an n × n elementary matrix, then

det(EB) = det(E) det(B)

Proof We shall consider three cases, each depending on the row operation that produces matrix E.

Case 1. If E results from multiplying a row of $I_n$ by k, then by Theorem 1.5.1, EB results from B by multiplying a row by k; so from Theorem 2.2.3a we have

det(EB) = k det(B)

But from Theorem 2.2.4a we have det(E) = k, so

det(EB) = det(E) det(B)

Cases 2 and 3. The proofs of the cases where E results from interchanging two rows of $I_n$ or from adding a multiple of one row to another follow the same pattern as Case 1 and are left as exercises.



Remark It follows by repeated applications of Lemma 2.3.2 that if B is an n × n matrix and $E_1, E_2, \ldots, E_r$ are n × n elementary matrices, then

$$\det(E_1E_2 \cdots E_rB) = \det(E_1)\det(E_2) \cdots \det(E_r)\det(B) \qquad (3)$$

For example,

$$\det(E_1E_2B) = \det(E_1)\det(E_2B) = \det(E_1)\det(E_2)\det(B)$$

Determinant Test for Invertibility 

The next theorem provides an important criterion for invertibility in terms of determinants, and it will be used in proving Formula 2.

THEOREM 2.3.3

A square matrix A is invertible if and only if det(A) ≠ 0.

Proof Let R be the reduced row-echelon form of A. As a preliminary step, we will show that det(A) and det(R) are both zero or both nonzero: Let $E_1, E_2, \ldots, E_r$ be the elementary matrices that correspond to the elementary row operations that produce R from A. Thus

$$R = E_r \cdots E_2E_1A$$

and from Formula 3,

$$\det(R) = \det(E_r) \cdots \det(E_2)\det(E_1)\det(A) \qquad (4)$$

But from Theorem 2.2.4 the determinants of the elementary matrices are all nonzero. (Keep in mind that multiplying a row by zero is not an allowable elementary row operation, so k ≠ 0 in this application of Theorem 2.2.4.) Thus, it follows from Formula 4 that det(A) and det(R) are both zero or both nonzero. Now to the main body of the proof.

If A is invertible, then by Theorem 1.6.4 we have R = I, so det(R) = 1 ≠ 0 and consequently det(A) ≠ 0. Conversely, if det(A) ≠ 0, then det(R) ≠ 0, so R cannot have a row of zeros. It follows from Theorem 1.4.3 that R = I, so A is invertible by Theorem 1.6.4.

It follows from Theorems 2.3.3 and 2.2.5 that a square matrix with two proportional rows or columns is not invertible.



EXAMPLE 3 Determinant Test for Invertibility

Since the first and third rows of

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 0 & 1 \\ 2 & 4 & 6 \end{bmatrix}$$

are proportional, det(A) = 0. Thus A is not invertible.

We are now ready for the result concerning products of matrices. 

THEOREM 2.3.4 



If A and B are square matrices of the same size, then

det(AB) = det(A) det(B)

Proof We divide the proof into two cases that depend on whether or not A is invertible. If the matrix A is not invertible, then by Theorem 1.6.5 neither is the product AB. Thus, from Theorem 2.3.3, we have det(AB) = 0 and det(A) = 0, so it follows that det(AB) = det(A) det(B).

Now assume that A is invertible. By Theorem 1.6.4, the matrix A is expressible as a product of elementary matrices, say

$$A = E_1E_2 \cdots E_r \qquad (5)$$

so

$$AB = E_1E_2 \cdots E_rB$$

Applying Formula 3 to this equation yields

$$\det(AB) = \det(E_1)\det(E_2) \cdots \det(E_r)\det(B)$$

and applying Formula 3 again yields

$$\det(AB) = \det(E_1E_2 \cdots E_r)\det(B)$$

which, from Formula 5, can be written as det(AB) = det(A) det(B).



EXAMPLE 4 Verifying That det(AB) = det(A) det(B)

Consider the matrices

$$A = \begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} -1 & 3 \\ 5 & 8 \end{bmatrix}, \qquad AB = \begin{bmatrix} 2 & 17 \\ 3 & 14 \end{bmatrix}$$

We leave it for the reader to verify that

det(A) = 1, det(B) = -23, and det(AB) = -23

Thus det(AB) = det(A) det(B), as guaranteed by Theorem 2.3.4.
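
The verification can be sketched in a few lines of Python. The helper names `det2` and `matmul` are assumptions introduced for this illustration, and the matrices are assumed to be those of Example 4.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

A = [[3, 1], [2, 1]]
B = [[-1, 3], [5, 8]]
AB = matmul(A, B)

print(AB)                          # [[2, 17], [3, 14]]
print(det2(A), det2(B), det2(AB))  # 1 -23 -23
```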


The following theorem gives a useful relationship between the determinant of an invertible matrix and the determinant of its 
inverse. 

THEOREM 2.3.5 



If A is invertible, then

$$\det(A^{-1}) = \frac{1}{\det(A)}$$

Proof Since $A^{-1}A = I$, it follows that $\det(A^{-1}A) = \det(I)$. Therefore, we must have $\det(A^{-1})\det(A) = 1$. Since det(A) ≠ 0, the proof can be completed by dividing through by det(A).



Linear Systems of the Form $Ax = \lambda x$

Many applications of linear algebra are concerned with systems of n linear equations in n unknowns that are expressed in the form

$$Ax = \lambda x \qquad (6)$$

where $\lambda$ is a scalar. Such systems are really homogeneous linear systems in disguise, since Formula 6 can be rewritten as $\lambda x - Ax = 0$ or, by inserting an identity matrix and factoring, as

$$(\lambda I - A)x = 0 \qquad (7)$$

Here is an example: 



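EXAMPLE 5 Finding $\lambda I - A$

The linear system

$x_1 + 3x_2 = \lambda x_1$
$4x_1 + 2x_2 = \lambda x_2$

can be written in matrix form as

$$\begin{bmatrix} 1 & 3 \\ 4 & 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \lambda\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

which is of form 6 with

$$A = \begin{bmatrix} 1 & 3 \\ 4 & 2 \end{bmatrix} \quad \text{and} \quad x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

This system can be rewritten as

$$\lambda\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - \begin{bmatrix} 1 & 3 \\ 4 & 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

or

$$\lambda\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - \begin{bmatrix} 1 & 3 \\ 4 & 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

or

$$\begin{bmatrix} \lambda - 1 & -3 \\ -4 & \lambda - 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

which is of form 7 with

$$\lambda I - A = \begin{bmatrix} \lambda - 1 & -3 \\ -4 & \lambda - 2 \end{bmatrix}$$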

The primary problem of interest for linear systems of the form 7 is to determine those values of $\lambda$ for which the system has a nontrivial solution; such a value of $\lambda$ is called a characteristic value or an eigenvalue of A. If $\lambda$ is an eigenvalue of A, then the nontrivial solutions of Formula 7 are called the eigenvectors of A corresponding to $\lambda$.

It follows from Theorem 2.3.3 that the system $(\lambda I - A)x = 0$ has a nontrivial solution if and only if

$$\det(\lambda I - A) = 0 \qquad (8)$$

This is called the characteristic equation of A; the eigenvalues of A can be found by solving this equation for $\lambda$.

Eigenvalues and eigenvectors will be studied again in subsequent chapters, where we will discuss their geometric interpretation and develop their properties in more depth.



EXAMPLE 6 Eigenvalues and Eigenvectors 



Find the eigenvalues and corresponding eigenvectors of the matrix A in Example 5. 



Solution 

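The characteristic equation of A is

$$\det(\lambda I - A) = \begin{vmatrix} \lambda - 1 & -3 \\ -4 & \lambda - 2 \end{vmatrix} = 0 \quad \text{or} \quad \lambda^2 - 3\lambda - 10 = 0$$

The factored form of this equation is $(\lambda + 2)(\lambda - 5) = 0$, so the eigenvalues of A are $\lambda = -2$ and $\lambda = 5$.

By definition,

$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

is an eigenvector of A if and only if x is a nontrivial solution of $(\lambda I - A)x = 0$; that is,

$$\begin{bmatrix} \lambda - 1 & -3 \\ -4 & \lambda - 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \qquad (9)$$

If $\lambda = -2$, then Formula 9 becomes

$$\begin{bmatrix} -3 & -3 \\ -4 & -4 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

Solving this system yields (verify) $x_1 = -t$, $x_2 = t$, so the eigenvectors corresponding to $\lambda = -2$ are the nonzero solutions of the form

$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -t \\ t \end{bmatrix}$$

Again from Formula 9, the eigenvectors of A corresponding to $\lambda = 5$ are the nontrivial solutions of

$$\begin{bmatrix} 4 & -3 \\ -4 & 3 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

We leave it for the reader to solve this system and show that the eigenvectors of A corresponding to $\lambda = 5$ are the nonzero solutions of the form

$$x = \begin{bmatrix} \tfrac{3}{4}t \\ t \end{bmatrix}$$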
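
For a 2 × 2 matrix the characteristic equation is a quadratic, so the computation in this example can be sketched directly. The code below is an illustration under stated assumptions, not a general eigenvalue method: the function names `eigenvalues_2x2` and `is_eigenvector` are hypothetical, only real roots are handled, and the matrix is assumed to be the one from Example 5.

```python
import math

def eigenvalues_2x2(a):
    """Real roots of det(lambda*I - A) = lambda^2 - tr(A)*lambda + det(A) = 0."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("no real eigenvalues; this sketch handles real roots only")
    root = math.sqrt(disc)
    return (trace - root) / 2, (trace + root) / 2

def is_eigenvector(a, lam, x):
    """Check that x is nonzero and satisfies A x = lam x."""
    ax = [sum(a[i][j] * x[j] for j in range(2)) for i in range(2)]
    return any(x) and all(math.isclose(ax[i], lam * x[i]) for i in range(2))

A = [[1, 3], [4, 2]]
print(eigenvalues_2x2(A))              # (-2.0, 5.0)
print(is_eigenvector(A, -2, [-1, 1]))  # True  (t = 1 in the first family)
print(is_eigenvector(A, 5, [3, 4]))    # True  (t = 4 in the second family)
```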



Summary 

In Theorem 1.6.4 we listed five results that are equivalent to the invertibility of a matrix A. We conclude this section by merging Theorem 2.3.3 with that list to produce the following theorem that relates all of the major topics we have studied thus far.

THEOREM 2.3.6

Equivalent Statements

If A is an n × n matrix, then the following statements are equivalent.

(a) A is invertible.

(b) Ax = 0 has only the trivial solution.

(c) The reduced row-echelon form of A is $I_n$.

(d) A can be expressed as a product of elementary matrices.

(e) Ax = b is consistent for every n × 1 matrix b.

(f) Ax = b has exactly one solution for every n × 1 matrix b.

(g) det(A) ≠ 0.



Exercise Set 2.3 






1. Verify that $\det(kA) = k^n \det(A)$ for

(a) $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$; k = 2

(b) $A = \begin{bmatrix} 2 & -1 & 3 \\ 3 & 2 & 1 \\ 1 & 4 & 5 \end{bmatrix}$; k = -2



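Verify that det(AB) = det(A) det(B) for

$$A = \begin{bmatrix} 2 & 1 & 0 \\ 3 & 4 & 0 \\ 0 & 0 & 2 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & -1 & 3 \\ 7 & 1 & 2 \\ 5 & 0 & 1 \end{bmatrix}$$

Is det(A + B) = det(A) + det(B)?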



By inspection, explain why det(A) = 0.

$$A = \begin{bmatrix} -2 & 8 & 1 & 4 \\ 3 & 2 & 5 & 1 \\ 1 & 10 & 6 & 5 \\ 4 & -6 & 4 & -3 \end{bmatrix}$$



4. 



Use Theorem 2.3.3 to determine which of the following matrices are invertible. 



(a) 



1 





-1" 


9 


-1 


4 


8 


9 


-1 



(b) 



4 2 8 

-2 1 -4 

3 1 6 



(c) 



f2 


-{l 


3^2 


-3 1/7 


5 


-9 



(d) 



-3 


r 


5 


6 


8 


3 



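5. Let

$$A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}$$

Assuming that det(A) = -7, find

(a) det(3A)

(b) $\det(A^{-1})$

(c) $\det(2A^{-1})$

(d) $\det((2A)^{-1})$

(e) $\det\begin{bmatrix} a & g & d \\ b & h & e \\ c & i & f \end{bmatrix}$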



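6. Without directly evaluating, show that x = 0 and x = 2 satisfy

$$\begin{vmatrix} x^2 & x & 2 \\ 2 & 1 & 1 \\ 0 & 0 & -5 \end{vmatrix} = 0$$

7. Without directly evaluating, show that

$$\det\begin{bmatrix} b+c & c+a & b+a \\ a & b & c \\ 1 & 1 & 1 \end{bmatrix} = 0$$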



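In Exercises 8-11 prove the identity without evaluating the determinants.

8. $\begin{vmatrix} a_1 & b_1 & a_1 + b_1 + c_1 \\ a_2 & b_2 & a_2 + b_2 + c_2 \\ a_3 & b_3 & a_3 + b_3 + c_3 \end{vmatrix} = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}$

9. $\begin{vmatrix} a_1 + b_1 & a_1 - b_1 & c_1 \\ a_2 + b_2 & a_2 - b_2 & c_2 \\ a_3 + b_3 & a_3 - b_3 & c_3 \end{vmatrix} = -2\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}$

10. $\begin{vmatrix} a_1 + b_1t & a_2 + b_2t & a_3 + b_3t \\ a_1t + b_1 & a_2t + b_2 & a_3t + b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = (1 - t^2)\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$

11. $\begin{vmatrix} a_1 & b_1 + ta_1 & c_1 + rb_1 + sa_1 \\ a_2 & b_2 + ta_2 & c_2 + rb_2 + sa_2 \\ a_3 & b_3 + ta_3 & c_3 + rb_3 + sa_3 \end{vmatrix} = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$

12. For which value(s) of k does A fail to be invertible?

(a) $A = \begin{bmatrix} k-3 & -2 \\ -2 & k-2 \end{bmatrix}$

(b) $A = \begin{bmatrix} 1 & 2 & 4 \\ 3 & 1 & 6 \\ k & 3 & 2 \end{bmatrix}$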



13. Use Theorem 2.3.3 to show that

$$\begin{bmatrix} \sin^2\alpha & \sin^2\beta & \sin^2\gamma \\ \cos^2\alpha & \cos^2\beta & \cos^2\gamma \\ 1 & 1 & 1 \end{bmatrix}$$

is not invertible for any values of $\alpha$, $\beta$, and $\gamma$.

14. Express the following linear systems in the form $(\lambda I - A)x = 0$.

(a) $x_1 + 2x_2 = \lambda x_1$
$2x_1 + x_2 = \lambda x_2$

(b) $2x_1 + 3x_2 = \lambda x_1$
$4x_1 + 3x_2 = \lambda x_2$

(c) $3x_1 + x_2 = \lambda x_1$
$-5x_1 - 3x_2 = \lambda x_2$

15. For each of the systems in Exercise 14, find

(i) the characteristic equation;

(ii) the eigenvalues;

(iii) the eigenvectors corresponding to each of the eigenvalues.



16. Let A and B be n × n matrices. Show that if A is invertible, then $\det(B) = \det(A^{-1}BA)$.



17. (a) Express

$$\begin{vmatrix} a_1 + b_1 & c_1 + d_1 \\ a_2 + b_2 & c_2 + d_2 \end{vmatrix}$$

as a sum of four determinants whose entries contain no sums.

(b) Express

$$\begin{vmatrix} a_1 + b_1 & c_1 + d_1 & e_1 + f_1 \\ a_2 + b_2 & c_2 + d_2 & e_2 + f_2 \\ a_3 + b_3 & c_3 + d_3 & e_3 + f_3 \end{vmatrix}$$

as a sum of eight determinants whose entries contain no sums.



18. Prove that a square matrix A is invertible if and only if $A^TA$ is invertible.

19. Prove Cases 2 and 3 of Lemma 2.3.2.



Discussion
Discovery

20. Let A and B be n × n matrices. You know from earlier work that AB and BA need not be equal. Is the same true for det(AB) and det(BA)? Explain your reasoning.

21. Let A and B be n × n matrices. You know from earlier work that AB is invertible if A and B are invertible. What can you say about the invertibility of AB if one or both of the factors are singular? Explain your reasoning.

22. Indicate whether the statement is always true or sometimes false. Justify each answer by giving a logical argument or a counterexample.

(a) det(2A) = 2 det(A)



(b) 



=w 



(c) det(I + A) = 1 + det(A)

(d) If det(A) = 0, then the homogeneous system Ax = 0 has infinitely many solutions.

23. Indicate whether the statement is always true or sometimes false. Justify your answer by giving a logical argument or a counterexample.

(a) If det(A) = 0, then A is not expressible as a product of elementary matrices.

(b) If the reduced row-echelon form of A has a row of zeros, then det(A) = 0.

(c) The determinant of a matrix is unchanged if the columns are written in reverse order.

(d) There is no square matrix A such that $\det(AA^T) = -1$.




2.4

A COMBINATORIAL APPROACH TO DETERMINANTS

There is a combinatorial view of determinants that actually predates matrices. In this section we explore this connection.



There is another way to approach determinants that complements the cofactor expansion approach. It is based on permutations. 



DEFINITION 



A permutation of the set of integers {1, 2, ..., n} is an arrangement of these integers in some order without omissions or repetitions.



EXAMPLE 1 Permutations of Three Integers 



There are six different permutations of the set of integers { 1, 2, 3}. These are 

(1,2,3) (2, 1,3) (3, 1,2) 
(1,3,2) (2,3, 1) (3,2, 1) 

One convenient method of systematically listing permutations is to use a permutation tree. This method is illustrated in our next 
example. 



EXAMPLE 2 Permutations of Four Integers 



List all permutations of the set of integers { 1, 2, 3, 4}. 



Solution 



Consider Figure 2.4.1. The four dots labeled 1, 2, 3, 4 at the top of the figure represent the possible choices for the first number in 
the permutation. The three branches emanating from these dots represent the possible choices for the second position in the 
permutation. Thus, if the permutation begins (2, — , — , — ), the three possibilities for the second position are 1, 3, and 4. The two 
branches emanating from each dot in the second position represent the possible choices for the third position. Thus, if the 
permutation begins (2, 3, — , — )> the two possible choices for the third position are 1 and 4. Finally, the single branch emanating 
from each dot in the third position represents the only possible choice for the fourth position. Thus, if the permutation begins with 
(2, 3, 4, — ), the only choice for the fourth position is 1. The different permutations can now be listed by tracing out all the 
possible paths through the "tree" from the first position to the last position. We obtain the following list by this process. 




Figure 2.4.1

(1, 2, 3, 4) (2, 1, 3, 4) (3, 1, 2, 4) (4, 1, 2, 3)
(1, 2, 4, 3) (2, 1, 4, 3) (3, 1, 4, 2) (4, 1, 3, 2)
(1, 3, 2, 4) (2, 3, 1, 4) (3, 2, 1, 4) (4, 2, 1, 3)
(1, 3, 4, 2) (2, 3, 4, 1) (3, 2, 4, 1) (4, 2, 3, 1)
(1, 4, 2, 3) (2, 4, 1, 3) (3, 4, 1, 2) (4, 3, 1, 2)
(1, 4, 3, 2) (2, 4, 3, 1) (3, 4, 2, 1) (4, 3, 2, 1)



From this example we see that there are 24 permutations of {1, 2, 3, 4}. This result could have been anticipated without actually listing the permutations by arguing as follows. Since the first position can be filled in four ways and then the second position in three ways, there are 4 · 3 ways of filling the first two positions. Since the third position can then be filled in two ways, there are 4 · 3 · 2 ways of filling the first three positions. Finally, since the last position can then be filled in only one way, there are 4 · 3 · 2 · 1 = 24 ways of filling all four positions. In general, the set {1, 2, ..., n} will have $n(n-1)(n-2) \cdots 2 \cdot 1 = n!$ different permutations.

We will denote a general permutation of the set {1, 2, ..., n} by $(j_1, j_2, \ldots, j_n)$. Here, $j_1$ is the first integer in the permutation, $j_2$ is the second, and so on. An inversion is said to occur in a permutation $(j_1, j_2, \ldots, j_n)$ whenever a larger integer precedes a smaller one. The total number of inversions occurring in a permutation can be obtained as follows: (1) find the number of integers that are less than $j_1$ and that follow $j_1$ in the permutation; (2) find the number of integers that are less than $j_2$ and that follow $j_2$ in the permutation. Continue this counting process for $j_3, \ldots, j_{n-1}$. The sum of these numbers will be the total number of inversions in the permutation.



EXAMPLE 3 Counting Inversions 



Determine the number of inversions in the following permutations: 

(a) (6,1,3,4,5,2) 

(b) (2,4,1,3) 

(c) (1,2,3,4) 

Solution 



(a) The number of inversions is 5 + 0 + 1 + 1 + 1 = 8.

(b) The number of inversions is 1 + 2 + 0 = 3.



(c) There are zero inversions in this permutation. 
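
The counting procedure used in this example can be sketched in a few lines of Python. This is only an illustration of the rule stated above; the function name count_inversions is our own choice and does not come from the text.

```python
def count_inversions(perm):
    """Count the inversions in a permutation given as a tuple of integers."""
    total = 0
    for i, j_i in enumerate(perm):
        # Count the integers that are less than j_i and that follow j_i.
        total += sum(1 for j_k in perm[i + 1:] if j_k < j_i)
    return total


print(count_inversions((6, 1, 3, 4, 5, 2)))  # 8
print(count_inversions((2, 4, 1, 3)))        # 3
print(count_inversions((1, 2, 3, 4)))        # 0
```

The three printed values agree with parts (a), (b), and (c) of the solution.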



DEFINITION 



A permutation is called even if the total number of inversions is an even integer and is called odd if the total number of 
inversions is an odd integer. 



EXAMPLE 4 Classifying Permutations 



The following table classifies the various permutations of { 1, 2, 3} as even or odd. 



Permutation    Number of Inversions    Classification

(1, 2, 3)      0                       even
(1, 3, 2)      1                       odd
(2, 1, 3)      1                       odd
(2, 3, 1)      2                       even
(3, 1, 2)      2                       even
(3, 2, 1)      3                       odd
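
A table of this kind can be regenerated mechanically. The following Python sketch, offered only as an illustration, lists every permutation of {1, 2, 3} with its inversion count and classification; the helper names count_inversions and classify are assumptions of ours, not notation from the text.

```python
from itertools import permutations


def count_inversions(perm):
    """Number of pairs in which a larger integer precedes a smaller one."""
    return sum(
        1
        for i in range(len(perm))
        for k in range(i + 1, len(perm))
        if perm[i] > perm[k]
    )


def classify(perm):
    """Return 'even' or 'odd' according to the parity of the inversion count."""
    return "even" if count_inversions(perm) % 2 == 0 else "odd"


for p in permutations((1, 2, 3)):
    print(p, count_inversions(p), classify(p))
```

Replacing (1, 2, 3) by (1, 2, 3, 4) in the loop classifies the 24 permutations listed in Example 2 in the same way.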



Combinatorial Definition of the Determinant 

By an elementary product from an n x n matrix A we shall mean any product of n entries from A, no two of which come from the
same row or the same column.



EXAMPLE 5 Elementary Products 



List all elementary products from the matrices 



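(a) $\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$

(b) $\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$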



Solution (a) 

Since each elementary product has two factors, and since each factor comes from a different row, an elementary product can be
written in the form

$a_{1\_}\,a_{2\_}$

where the blanks designate column numbers. Since no two factors in the product come from the same column, the column numbers
must be 1 2 or 2 1. Thus the only elementary products are $a_{11}a_{22}$ and $a_{12}a_{21}$.

Solution (b) 

Since each elementary product has three factors, each of which comes from a different row, an elementary product can be written
in the form

$a_{1\_}\,a_{2\_}\,a_{3\_}$

Since no two factors in the product come from the same column, the column numbers have no repetitions; consequently, they must
form a permutation of the set {1, 2, 3}. These 3! = 6 permutations yield the following list of elementary products.

$a_{11}a_{22}a_{33}$    $a_{12}a_{21}a_{33}$    $a_{13}a_{21}a_{32}$
$a_{11}a_{23}a_{32}$    $a_{12}a_{23}a_{31}$    $a_{13}a_{22}a_{31}$

As this example points out, an n x n matrix A has n! elementary products. They are the products of the form $a_{1j_1}a_{2j_2}\cdots a_{nj_n}$, where
$(j_1, j_2, \ldots, j_n)$ is a permutation of the set {1, 2, ..., n}. By a signed elementary product from A we shall mean an elementary
product $a_{1j_1}a_{2j_2}\cdots a_{nj_n}$ multiplied by +1 or -1. We use the + if $(j_1, j_2, \ldots, j_n)$ is an even permutation and the - if
$(j_1, j_2, \ldots, j_n)$ is an odd permutation.



EXAMPLE 6 Signed Elementary Products 



List all signed elementary products from the matrices 



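(a) $\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$

(b) $\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$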



Solution 



(a)

Elementary Product    Associated Permutation    Even or Odd    Signed Elementary Product

$a_{11}a_{22}$        (1, 2)                    even           $a_{11}a_{22}$
$a_{12}a_{21}$        (2, 1)                    odd            $-a_{12}a_{21}$

(b)

Elementary Product    Associated Permutation    Even or Odd    Signed Elementary Product

$a_{11}a_{22}a_{33}$  (1, 2, 3)                 even           $a_{11}a_{22}a_{33}$
$a_{11}a_{23}a_{32}$  (1, 3, 2)                 odd            $-a_{11}a_{23}a_{32}$
$a_{12}a_{21}a_{33}$  (2, 1, 3)                 odd            $-a_{12}a_{21}a_{33}$
$a_{12}a_{23}a_{31}$  (2, 3, 1)                 even           $a_{12}a_{23}a_{31}$
$a_{13}a_{21}a_{32}$  (3, 1, 2)                 even           $a_{13}a_{21}a_{32}$
$a_{13}a_{22}a_{31}$  (3, 2, 1)                 odd            $-a_{13}a_{22}a_{31}$



We are now in a position to give the combinatorial definition of the determinant function. 



DEFINITION 



Let A be a square matrix. We define det(A) to be the sum of all signed elementary products from A.
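
The definition translates directly into a short program. The Python sketch below is one possible rendering, not an efficient method: it forms every signed elementary product and adds them up. The names sign and det_combinatorial are hypothetical, and the matrix is assumed to be given as a list of rows. Columns are numbered from 0 in the code, which does not change the parity of any permutation.

```python
from itertools import permutations


def sign(perm):
    """Return +1 for an even permutation and -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(perm))
        for k in range(i + 1, len(perm))
        if perm[i] > perm[k]
    )
    return 1 if inversions % 2 == 0 else -1


def det_combinatorial(A):
    """Sum of all signed elementary products of the square matrix A."""
    n = len(A)
    total = 0
    for cols in permutations(range(n)):
        # One factor from each row, with column numbers given by cols.
        product = 1
        for row, col in enumerate(cols):
            product *= A[row][col]
        total += sign(cols) * product
    return total


print(det_combinatorial([[3, 1], [4, -2]]))                    # -10
print(det_combinatorial([[1, 2, 3], [-4, 5, 6], [7, -8, 9]]))  # 240
```

The two printed values, -10 and 240, are the determinants obtained in Example 8 of this section.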



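EXAMPLE 7 Determinants of 2 x 2 and 3 x 3 Matrices



Referring to Example 6, we obtain

(a) $\det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = a_{11}a_{22} - a_{12}a_{21}$

(b) $\det\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}$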



Of course, this definition of det(A) agrees with the definition in Section 2.1, although we will not prove this. 



These expressions suggest the mnemonic devices given in Figure 2.4.2. The formula in part (a) of Example 7 is obtained from
Figure 2.4.2a by multiplying the entries on the rightward arrow and subtracting the product of the entries on the leftward arrow.
The formula in part (b) of Example 7 is obtained by recopying the first and second columns as shown in Figure 2.4.2b. The
determinant is then computed by summing the products on the rightward arrows and subtracting the products on the leftward
arrows.

Figure 2.4.2 (a) Determinant of a 2 x 2 matrix; (b) Determinant of a 3 x 3 matrix



Warning We emphasize that the methods shown in Figure 2.4.2 do not work for determinants of 4 x 4 matrices or higher. 



EXAMPLE 8 Evaluating Determinants 



Evaluate the determinants of

$A = \begin{bmatrix} 3 & 1 \\ 4 & -2 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 & 3 \\ -4 & 5 & 6 \\ 7 & -8 & 9 \end{bmatrix}$



Solution 




Using the method of Figure 2.4.2a gives

det(A) = (3)(-2) - (1)(4) = -10

Using the method of Figure 2.4.2b gives

det(B) = (45) + (84) + (96) - (105) - (-48) - (-72) = 240



The determinant of A may be written as

$\det(A) = \sum \pm\, a_{1j_1}a_{2j_2}\cdots a_{nj_n}$    (1)

where $\sum$ indicates that the terms are to be summed over all permutations $(j_1, j_2, \ldots, j_n)$ and the + or - is selected in each term
according to whether the permutation is even or odd. This notation is useful when the combinatorial definition of a determinant
needs to be emphasized.



Remark Evaluating determinants directly from this definition leads to computational difficulties. Indeed, evaluating a 4 x 4
determinant directly would involve computing 4! = 24 signed elementary products, and a 10 x 10 determinant would require the
computation of 10! = 3,628,800 signed elementary products. Even the fastest of digital computers cannot handle the computation
of a 25 x 25 determinant by this method in a practical amount of time.
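
The growth described in this remark is easy to tabulate. The Python sketch below counts the signed elementary products for a few matrix sizes and converts the count into a rough running time. The rate of $10^{12}$ products per second is purely an assumed figure chosen for illustration, not a measurement of any particular machine, and the function names are our own.

```python
import math

# Assumed processing rate (products per second); an illustrative guess only.
ASSUMED_PRODUCTS_PER_SECOND = 10**12
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def signed_product_count(n):
    """Number of signed elementary products of an n x n matrix, namely n!."""
    return math.factorial(n)


def estimated_years(n):
    """Rough time in years to form all n! products at the assumed rate."""
    seconds = signed_product_count(n) / ASSUMED_PRODUCTS_PER_SECOND
    return seconds / SECONDS_PER_YEAR


for n in (4, 10, 25):
    print(n, signed_product_count(n), estimated_years(n))
```

Even at the assumed rate, the 25 x 25 case works out to hundreds of thousands of years, which is the point of the remark.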



Exercise Set 2.4 






1. Find the number of inversions in each of the following permutations of {1, 2, 3, 4, 5}.

(a) (4 1 3 5 2)

(b) (5 3 4 2 1)

(c) (3 2 5 4 1)

(d) (5 4 3 2 1)

(e) (1 2 3 4 5)

(f) (1 4 2 3 5)

2. Classify each of the permutations in Exercise 1 as even or odd.

In Exercises 3-12 evaluate the determinant using the method of this section. 





3 


5 




3. 


-2 4 






4 1 




4. 


8 2 






-5 6 


5. 


-7 




-2 



6. 



7. 



4 /3 



a-3 5 
-3 ,3-2 



■2 7 
5 1 
3 8 



■2 
4 



-2 1 4 
3 5-7 
1 6 2 



10. 



11. 



12. 



-1 1 

3 
1 7 


2 

-5 

2 


3 
2 -1 
1 9 




5 

-4 


c -4 

2 1 
4 c-1 


3 
2 



13. 



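Find all values of $\lambda$ for which det(A) = 0, using the method of this section.

(a) $A = \begin{bmatrix} \lambda - 2 & 1 \\ -5 & \lambda + 4 \end{bmatrix}$

(b) $A = \begin{bmatrix} \lambda - 4 & 0 & 0 \\ 0 & \lambda & 2 \\ 0 & 3 & \lambda - 1 \end{bmatrix}$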



14. 



Classify each permutation of { 1, 2, 3, 4} as even or odd. 



15. 



(a) Use the results in Exercise 14 to construct a formula for the determinant of a 4 x 4 matrix. 



(b) Why do the mnemonics of Figure 2.4.2 fail for a 4 x 4 matrix? 



16. 



Use the formula obtained in Exercise 15 to evaluate

$\begin{vmatrix} 4 & -9 & 9 & 2 \\ -2 & 5 & 6 & 4 \\ 1 & 2 & -5 & -3 \\ 1 & -2 & 0 & -2 \end{vmatrix}$



Use the combinatorial definition of the determinant to evaluate 



17. 



(a) 



(b) 



















■3 













•4 








- 


-1 













2 













5 
















5 
































-4 










3 





















1 












-2 
















Solve for x.

$\begin{vmatrix} x & -1 \\ 3 & 1 - x \end{vmatrix} = \begin{vmatrix} 1 & 0 & -3 \\ 2 & x & -6 \\ 1 & 3 & x - 5 \end{vmatrix}$



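Show that the value of the determinant

$\begin{vmatrix} \sin\theta & \cos\theta & 0 \\ -\cos\theta & \sin\theta & 0 \\ \sin\theta - \cos\theta & \sin\theta + \cos\theta & 1 \end{vmatrix}$

does not depend on $\theta$, using the method of this section.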



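Prove that the matrices

$A = \begin{bmatrix} a & b \\ 0 & c \end{bmatrix}$ and $B = \begin{bmatrix} d & e \\ 0 & f \end{bmatrix}$

commute if and only if

$\begin{vmatrix} b & a - c \\ e & d - f \end{vmatrix} = 0$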





Discussion Discovery

21. Explain why the determinant of an n x n matrix with integer entries must be an integer, using the method of this section.

22. What can you say about the determinant of an n x n matrix all of whose entries are 1? Explain your reasoning, using the
method of this section.

23.

(a) Explain why an n x n matrix with a row of zeros must have a zero determinant, using the method of this section.

(b) Explain why an n x n matrix with a column of zeros must have a zero determinant.

24. Use Formula 1 to discover a formula for the determinant of an n x n diagonal matrix. Express the formula in words.

25. Use Formula 1 to discover a formula for the determinant of an n x n upper triangular matrix. Express the formula in words.
Do the same for a lower triangular matrix.

Chapter 2
Supplementary Exercises

Use Cramer's rule to solve for x' and y' in terms of x and y . 

2. Use Cramer's rule to solve for x' and y' in terms of x and y.

$x = x' \cos\theta - y' \sin\theta$

$y = x' \sin\theta + y' \cos\theta$

3. By examining the determinant of the coefficient matrix, show that the following system has a nontrivial solution if and
only if $\alpha = \beta$.

$x + y + \alpha z = 0$

$x + y + \beta z = 0$

$\alpha x + \beta y + z = 0$

4. Let A be a 3 x 3 matrix, each of whose entries is 1 or 0. What is the largest possible value for det(A)?



5. 



(a) For the triangle in the accompanying figure, use trigonometry to show that

$b\cos\gamma + c\cos\beta = a$
$c\cos\alpha + a\cos\gamma = b$
$a\cos\beta + b\cos\alpha = c$

and then apply Cramer's rule to show that

$\cos\alpha = \dfrac{b^2 + c^2 - a^2}{2bc}$

(b) Use Cramer's rule to obtain similar formulas for $\cos\beta$ and $\cos\gamma$.




6. 



Use determinants to show that for all real values of $\lambda$ the only solution of

$x - 2y = \lambda x$
$x - y = \lambda y$

is x = 0, y = 0.



Prove: If A is invertible, then adj(A) is invertible and



8. 



Prove: If A is an n x n matrix, then $\det[\operatorname{adj}(A)] = [\det(A)]^{n-1}$.



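9. (For Readers Who Have Studied Calculus) Show that if $f_1(x)$, $f_2(x)$, $g_1(x)$, and $g_2(x)$ are differentiable functions,
and if

$W = \begin{vmatrix} f_1(x) & f_2(x) \\ g_1(x) & g_2(x) \end{vmatrix}$

then

$\dfrac{dW}{dx} = \begin{vmatrix} f_1'(x) & f_2'(x) \\ g_1(x) & g_2(x) \end{vmatrix} + \begin{vmatrix} f_1(x) & f_2(x) \\ g_1'(x) & g_2'(x) \end{vmatrix}$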



10. 



(a) In the accompanying figure, the area of the triangle ABC can be expressed as

area ABC = area ADEC + area CEFB - area ADFB

Use this and the fact that the area of a trapezoid equals 1/2 the altitude times the sum of
the parallel sides to show that

$\text{area } ABC = \dfrac{1}{2}\begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix}$

Note In the derivation of this formula, the vertices are labeled such that the triangle is
traced counterclockwise proceeding from $(x_1, y_1)$ to $(x_2, y_2)$ to $(x_3, y_3)$. For a clockwise
orientation, the determinant above yields the negative of the area.

(b) Use the result in (a) to find the area of the triangle with vertices (3, 3), (4, 0), (-2, -1).

Figure Ex-10




11. 



Prove: If the entries in each row of an n x n matrix A add up to zero, then the determinant of A is zero. 
Hint Consider the product AX, where X is the n x 1 matrix, each of whose entries is one.



12. Let A be an n x n matrix and B the matrix that results when the rows of A are written in reverse order (last row becomes
the first, and so forth). How are det(A) and det(B) related?



13. 



Indicate how $A^{-1}$ will be affected if

(a) the ith and jth rows of A are interchanged.

(b) the ith row of A is multiplied by a nonzero scalar, c.

(c) c times the ith row of A is added to the jth row.



14. Let A be an n x n matrix. Suppose that $B_1$ is obtained by adding the same number t to each entry in the ith row of A and
that $B_2$ is obtained by subtracting t from each entry in the ith row of A. Show that $\det(A) = \frac{1}{2}[\det(B_1) + \det(B_2)]$.



15. Let

$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$

(a) Express $\det(\lambda I - A)$ as a polynomial $p(\lambda) = \lambda^3 + b\lambda^2 + c\lambda + d$.

(b) Express the coefficients b and d in terms of determinants and traces.



16. 



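Without directly evaluating the determinant, show that

$\begin{vmatrix} \sin\alpha & \cos\alpha & \sin(\alpha + \delta) \\ \sin\beta & \cos\beta & \sin(\beta + \delta) \\ \sin\gamma & \cos\gamma & \sin(\gamma + \delta) \end{vmatrix} = 0$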



17. 



Use the fact that 21,375, 38,798, 34,162, 40,223, and 79,154 are all divisible by 19 to show that

$\begin{vmatrix} 2 & 1 & 3 & 7 & 5 \\ 3 & 8 & 7 & 9 & 8 \\ 3 & 4 & 1 & 6 & 2 \\ 4 & 0 & 2 & 2 & 3 \\ 7 & 9 & 1 & 5 & 4 \end{vmatrix}$

is divisible by 19 without directly evaluating the determinant.



18. 



Find the eigenvalues and corresponding eigenvectors for each of the following systems.

(a) $x_2 + 9x_3 = \lambda x_1$
    $x_1 + 4x_2 - 7x_3 = \lambda x_2$
    $x_1 - 3x_3 = \lambda x_3$

(b) $x_2 + x_3 = \lambda x_1$
    $x_1 - x_3 = \lambda x_2$
    $x_1 + 5x_2 + 3x_3 = \lambda x_3$

Chapter 2

Technology Exercises



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 

Section 2.1 

T1. (Determinants) Read your documentation on how to compute determinants, and then compute several determinants.



T2. (Minors, Cofactors, and Adjoints) Technology utilities vary widely in their treatment of minors, cofactors, and adjoints. 
For example, some utilities have commands for computing minors but not cofactors, and some provide direct commands for 
finding adjoints, whereas others do not. Thus, depending on your utility, you may have to piece together commands or do 
some sign adjustment by hand to find cofactors and adjoints. Read your documentation, and then find the adjoint of the matrix 
A in Example 6. 



T3. Use Cramer's rule to find a polynomial of degree 3 that passes through the points (0, 1), (1, -1), (2, -1), and (3, 7). Verify
your results by plotting the points and the curve on one graph.

Section 2.2



T1. (Determinant of a Transpose) Confirm part (b) of Theorem 2.2.3 using some matrices of your choice.

Section 2.3

T1. (Determinant of a Product) Confirm Theorem 2.3.4 for some matrices of your choice.

T2. (Determinant of an Inverse) Confirm Theorem 2.3.5 for some matrices of your choice.

T3. (Characteristic Equation) If you are working with a CAS, use it to find the characteristic equation of the matrix A in
Example 6. Also, read your documentation on how to solve equations, and then solve the equation $\det(\lambda I - A) = 0$ for the
eigenvalues of A.

Section 2.4 

T1. (Determinant Formulas) If you are working with a CAS, use it to confirm the formulas in Example 7. Also, use it to obtain
the formula requested in Exercise 15 of Section 2.4.



T2. (Simplification) If you are working with a CAS, read the documentation on simplifying algebraic expressions, and then use
the determinant and simplification commands in combination to show that

$\begin{vmatrix} a & b & c & d \\ -b & a & d & -c \\ -c & -d & a & b \\ -d & c & -b & a \end{vmatrix} = (a^2 + b^2 + c^2 + d^2)^2$



T3. Use the method of Exercise T2 to find a simple formula for the determinant

$\begin{vmatrix} (a + b)^2 & c^2 & c^2 \\ a^2 & (b + c)^2 & a^2 \\ b^2 & b^2 & (c + a)^2 \end{vmatrix}$

CHAPTER 3

Vectors in 2-Space and 3-Space



INTRODUCTION: Many physical quantities, such as area, length, mass, and temperature, are completely described 
once the magnitude of the quantity is given. Such quantities are called scalars. Other physical quantities are not completely
determined until both a magnitude and a direction are specified. These quantities are called vectors. For example, wind 
movement is usually described by giving the speed and direction, say 20 mph northeast. The wind speed and wind direction 
form a vector called the wind velocity. Other examples of vectors are force and displacement. In this chapter our goal is to 
review some of the basic theory of vectors in two and three dimensions. 

Note. Readers already familiar with the contents of this chapter can go to Chapter 4 with no loss of continuity.

3.1 INTRODUCTION TO VECTORS (GEOMETRIC)

In this section, vectors in 2-space and 3-space will be introduced geometrically, arithmetic operations on vectors will be
defined, and some basic properties of these arithmetic operations will be established.



Geometric Vectors 

Vectors can be represented geometrically as directed line segments or arrows in 2-space or 3-space. The direction of the arrow 
specifies the direction of the vector, and the length of the arrow describes its magnitude. The tail of the arrow is called the initial 
point of the vector, and the tip of the arrow the terminal point. Symbolically, we shall denote vectors in lowercase boldface type 
(for instance, a, k, v, w, and x). When discussing vectors, we shall refer to numbers as scalars. For now, all our scalars will be 
real numbers and will be denoted in lowercase italic type (for instance, a, k, v, w, and x). 

If, as in Figure 3.1.1a, the initial point of a vector v is A and the terminal point is B, we write

$\mathbf{v} = \overrightarrow{AB}$

Vectors with the same length and same direction, such as those in Figure 3.1.1b, are called equivalent. Since we want a vector to
be determined solely by its length and direction, equivalent vectors are regarded as equal even though they may be located in
different positions. If v and w are equivalent, we write

$\mathbf{v} = \mathbf{w}$

Figure 3.1.1 (a) The vector AB; (b) Equivalent vectors



DEFINITION 



If v and w are any two vectors, then the sum v + w is the vector determined as follows: Position the vector w so that its initial
point coincides with the terminal point of v. The vector v + w is represented by the arrow from the initial point of v to the
terminal point of w (Figure 3.1.2a).

Figure 3.1.2 (b) v + w = w + v



In Figure 3.1.2b we have constructed two sums, v + w (color arrows) and w + v (gray arrows). It is evident that

v + w = w + v

and that the sum coincides with the diagonal of the parallelogram determined by v and w when these vectors are positioned so
that they have the same initial point.

The vector of length zero is called the zero vector and is denoted by 0. We define

0 + v = v + 0 = v

for every vector v. Since there is no natural direction for the zero vector, we shall agree that it can be assigned any direction that
is convenient for the problem being considered. If v is any nonzero vector, then -v, the negative of v, is defined to be the vector
that has the same magnitude as v but is oppositely directed (Figure 3.1.3). This vector has the property

v + (-v) = 0

(Why?) In addition, we define -0 = 0. Subtraction of vectors is defined as follows:




Figure 3.1.3 



The negative of v has the same length as v but is oppositely directed. 



DEFINITION 



If v and w are any two vectors, then the difference of w from v is defined by

v - w = v + (-w)

(Figure 3.1.4a).

Figure 3.1.4

To obtain the difference v - w without constructing -w, position v and w so that their initial points coincide; the vector from the
terminal point of w to the terminal point of v is then the vector v - w (Figure 3.1.4b).



DEFINITION 



If v is a nonzero vector and k is a nonzero real number (scalar), then the product kv is defined to be the vector whose length is
|k| times the length of v and whose direction is the same as that of v if k > 0 and opposite to that of v if k < 0. We define
kv = 0 if k = 0 or v = 0.

Figure 3.1.5 illustrates the relation between a vector v and the vectors (1/2)v, (-1)v, 2v, and (-3)v. Note that the vector (-1)v
has the same length as v but is oppositely directed. Thus (-1)v is just the negative of v; that is,

(-1)v = -v




Figure 3.1.5 



A vector of the form kv is called a scalar multiple of v. As evidenced by Figure 3.1.5, vectors that are scalar multiples of each
other are parallel. Conversely, it can be shown that nonzero parallel vectors are scalar multiples of each other. We omit the proof. 

Vectors in Coordinate Systems 

Problems involving vectors can often be simplified by introducing a rectangular coordinate system. For the moment we shall 
restrict the discussion to vectors in 2-space (the plane). Let v be any vector in the plane, and assume, as in Figure 3.1.6, that v has 
been positioned so that its initial point is at the origin of a rectangular coordinate system. The coordinates $(v_1, v_2)$ of the terminal
point of v are called the components of v, and we write

$\mathbf{v} = (v_1, v_2)$




Figure 3.1.6 



$v_1$ and $v_2$ are the components of v.



If equivalent vectors, v and w, are located so that their initial points fall at the origin, then it is obvious that their terminal points 
must coincide (since the vectors have the same length and direction); thus the vectors have the same components. Conversely, 
vectors with the same components are equivalent since they have the same length and the same direction. In summary, two 
vectors 

$\mathbf{v} = (v_1, v_2)$ and $\mathbf{w} = (w_1, w_2)$

are equivalent if and only if

$v_1 = w_1$ and $v_2 = w_2$



The operations of vector addition and multiplication by scalars are easy to carry out in terms of components. As illustrated in
Figure 3.1.7, if

$\mathbf{v} = (v_1, v_2)$ and $\mathbf{w} = (w_1, w_2)$

then

$\mathbf{v} + \mathbf{w} = (v_1 + w_1, v_2 + w_2)$    (1)




Figure 3.1.7 

If $\mathbf{v} = (v_1, v_2)$ and k is any scalar, then by using a geometric argument involving similar triangles, it can be shown (Exercise 16)
that

$k\mathbf{v} = (kv_1, kv_2)$    (2)

(Figure 3.1.8). Thus, for example, if v = (1, -2) and w = (7, 6), then

v + w = (1, -2) + (7, 6) = (1 + 7, -2 + 6) = (8, 4)

and

4v = 4(1, -2) = (4(1), 4(-2)) = (4, -8)

Since v - w = v + (-1)w, it follows from Formulas 1 and 2 that

$\mathbf{v} - \mathbf{w} = (v_1 - w_1, v_2 - w_2)$

(Verify.)
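
Formulas 1 and 2 can be tried out numerically. The Python sketch below represents vectors as tuples of components; the function names add, scale, and subtract are our own, chosen only for this illustration.

```python
def add(v, w):
    """Componentwise sum, as in Formula 1."""
    return tuple(vi + wi for vi, wi in zip(v, w))


def scale(k, v):
    """Scalar multiple kv, as in Formula 2."""
    return tuple(k * vi for vi in v)


def subtract(v, w):
    """Difference v - w, computed as v + (-1)w."""
    return add(v, scale(-1, w))


v, w = (1, -2), (7, 6)
print(add(v, w))       # (8, 4)
print(scale(4, v))     # (4, -8)
print(subtract(v, w))  # (-6, -8)
```

The first two results reproduce the computations above, and the third agrees with the componentwise formula for v - w.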







Figure 3.1.8 



Vectors in 3-Space 



Just as vectors in the plane can be described by pairs of real numbers, vectors in 3-space can be described by triples of real
numbers by introducing a rectangular coordinate system. To construct such a coordinate system, select a point O, called the
origin, and choose three mutually perpendicular lines, called coordinate axes, passing through the origin. Label these axes x, y,
and z, and select a positive direction for each coordinate axis as well as a unit of length for measuring distances (Figure 3.1.9a).
Each pair of coordinate axes determines a plane called a coordinate plane. These are referred to as the xy-plane, the xz-plane,
and the yz-plane. To each point P in 3-space we assign a triple of numbers (x, y, z), called the coordinates of P, as follows: Pass
three planes through P parallel to the coordinate planes, and denote the points of intersection of these planes with the three
coordinate axes by X, Y, and Z (Figure 3.1.9b). The coordinates of P are defined to be the signed lengths

x = OX, y = OY, z = OZ

In Figure 3.1.10a we have constructed the point whose coordinates are (4, 5, 6) and in Figure 3.1.10b the point whose coordinates
are (-3, 2, -4).



Figure 3.1.9

Figure 3.1.10



Rectangular coordinate systems in 3-space fall into two categories, left-handed and right-handed. A right-handed system has the
property that an ordinary screw pointed in the positive direction on the z-axis would be advanced if the positive x-axis were
rotated 90° toward the positive y-axis (Figure 3.1.11a); the system is left-handed if the screw would be retracted (Figure
3.1.11b).

Figure 3.1.11 (a) Right-handed; (b) Left-handed

Remark In this book we shall use only right-handed coordinate systems. 

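If, as in Figure 3.1.12, a vector v in 3-space is positioned so its initial point is at the origin of a rectangular coordinate system,
then the coordinates of the terminal point are called the components of v, and we write

$\mathbf{v} = (v_1, v_2, v_3)$

Figure 3.1.12

If $\mathbf{v} = (v_1, v_2, v_3)$ and $\mathbf{w} = (w_1, w_2, w_3)$ are two vectors in 3-space, then arguments similar to those used for vectors in a plane
can be used to establish the following results.

v and w are equivalent if and only if $v_1 = w_1$, $v_2 = w_2$, and $v_3 = w_3$

$\mathbf{v} + \mathbf{w} = (v_1 + w_1, v_2 + w_2, v_3 + w_3)$

$k\mathbf{v} = (kv_1, kv_2, kv_3)$, where k is any scalar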




EXAMPLE 1 Vector Computations with Components 



If v = (1, -3, 2) and w = (4, 2, 1), then

v + w = (5, -1, 3), 2v = (2, -6, 4), -w = (-4, -2, -1),

v - w = v + (-w) = (-3, -5, 1)



Application to Computer Color Models 






Colors on computer monitors are commonly based on what is called the RGB color model. Colors in this system are created 
by adding together percentages of the primary colors red (R), green (G), and blue (B). One way to do this is to identify the 
primary colors with the vectors 

r = (1,0,0) (pure red), 
g= (0, 1, 0) (pure green), 
b = (0, 0, 1) (pure blue) 

in $R^3$ and to create all other colors by forming linear combinations of r, g, and b using coefficients
between 0 and 1, inclusive; these coefficients represent the percentage of each pure color in the
mix. The set of all such color vectors is called RGB space or the RGB color cube. Thus, each color
vector c in this cube is expressible as a linear combination of the form

$\mathbf{c} = c_1\mathbf{r} + c_2\mathbf{g} + c_3\mathbf{b}$
$= c_1(1, 0, 0) + c_2(0, 1, 0) + c_3(0, 0, 1)$

where $0 \le c_i \le 1$. As indicated in the figure, the corners of the cube represent the pure primary colors
together with the colors black, white, magenta, cyan, and yellow. The vectors along the diagonal
running from black to white correspond to shades of gray.
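
As a small illustration of this color model, the Python sketch below forms a color vector as a linear combination of r, g, and b. The function name mix is hypothetical, and the coefficients are assumed to be given as numbers between 0 and 1; the identification of the corner (1, 1, 0) with yellow is the usual RGB convention rather than something derived here.

```python
R = (1.0, 0.0, 0.0)  # pure red
G = (0.0, 1.0, 0.0)  # pure green
B = (0.0, 0.0, 1.0)  # pure blue


def mix(c1, c2, c3):
    """Return the color vector c1*r + c2*g + c3*b in the RGB color cube."""
    for c in (c1, c2, c3):
        if not 0.0 <= c <= 1.0:
            raise ValueError("each coefficient must lie between 0 and 1")
    return tuple(c1 * r + c2 * g + c3 * b for r, g, b in zip(R, G, B))


print(mix(1, 1, 0))        # (1.0, 1.0, 0.0), the yellow corner
print(mix(0.5, 0.5, 0.5))  # (0.5, 0.5, 0.5), a shade of gray on the diagonal
```

Equal coefficients always give a point on the diagonal from black (0, 0, 0) to white (1, 1, 1).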



Sometimes a vector is positioned so that its initial point is not at the origin. If the vector $\overrightarrow{P_1P_2}$ has initial point $P_1(x_1, y_1, z_1)$ and
terminal point $P_2(x_2, y_2, z_2)$, then

$\overrightarrow{P_1P_2} = (x_2 - x_1, y_2 - y_1, z_2 - z_1)$

That is, the components of $\overrightarrow{P_1P_2}$ are obtained by subtracting the coordinates of the initial point from the coordinates of the
terminal point. This may be seen using Figure 3.1.13:

Figure 3.1.13

The vector $\overrightarrow{P_1P_2}$ is the difference of vectors $\overrightarrow{OP_2}$ and $\overrightarrow{OP_1}$, so

$\overrightarrow{P_1P_2} = \overrightarrow{OP_2} - \overrightarrow{OP_1} = (x_2, y_2, z_2) - (x_1, y_1, z_1) = (x_2 - x_1, y_2 - y_1, z_2 - z_1)$



EXAMPLE 2 Finding the Components of a Vector 



The components of the vector $\mathbf{v} = \overrightarrow{P_1P_2}$ with initial point $P_1(2, -1, 4)$ and terminal point $P_2(7, 5, -8)$ are

v = (7 - 2, 5 - (-1), (-8) - 4) = (5, 6, -12)

In 2-space the vector with initial point $P_1(x_1, y_1)$ and terminal point $P_2(x_2, y_2)$ is

$\overrightarrow{P_1P_2} = (x_2 - x_1, y_2 - y_1)$
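
The rule "terminal point minus initial point" works the same way in 2-space and 3-space, as the following Python sketch suggests. The function name components is our own, and points are assumed to be given as tuples of coordinates.

```python
def components(initial, terminal):
    """Components of the vector from the initial point to the terminal point."""
    if len(initial) != len(terminal):
        raise ValueError("both points must have the same number of coordinates")
    return tuple(t - i for i, t in zip(initial, terminal))


print(components((2, -1, 4), (7, 5, -8)))  # (5, 6, -12)
print(components((0, 0), (3, 4)))          # (3, 4)
```

The first line reproduces the result of Example 2.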



Translation of Axes 

The solutions to many problems can be simplified by translating the coordinate axes to obtain new axes parallel to the original 
ones. 

In Figure 3.1.14a we have translated the axes of an xy-coordinate system to obtain an x'y'-coordinate system whose origin O' is
at the point (x, y) = (k, l). A point P in 2-space now has both (x, y) coordinates and (x', y') coordinates. To see how the two are
related, consider the vector $\overrightarrow{O'P}$ (Figure 3.1.14b). In the xy-system its initial point is at (k, l) and its terminal point is at (x, y), so
$\overrightarrow{O'P} = (x - k, y - l)$. In the x'y'-system its initial point is at (0, 0) and its terminal point is at (x', y'), so $\overrightarrow{O'P} = (x', y')$.
Therefore,

x' = x - k, y' = y - l

These formulas are called the translation equations.

Figure 3.1.14



EXAMPLE 3 Using the Translation Equations 



Suppose that an xy-coordinate system is translated to obtain an x'y'-coordinate system whose origin has xy-coordinates
(k, l) = (4, 1).

(a) Find the x'y'-coordinates of the point with the xy-coordinates P(2, 0).

(b) Find the xy-coordinates of the point with x'y'-coordinates Q(-1, 5).

Solution (a)

The translation equations are

x' = x - 4, y' = y - 1

so the x'y'-coordinates of P(2, 0) are x' = 2 - 4 = -2 and y' = 0 - 1 = -1.

Solution (b)

The translation equations in (a) can be rewritten as

x = x' + 4, y = y' + 1

so the xy-coordinates of Q are x = -1 + 4 = 3 and y = 5 + 1 = 6.



In 3-space the translation equations are

x' = x - k, y' = y - l, z' = z - m

where (k, l, m) are the xyz-coordinates of the x'y'z'-origin.
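
The translation equations and their inverses can be sketched in Python as follows. The function names to_translated and from_translated are assumptions made for this illustration, and the same code serves for 2-space and 3-space because it works coordinate by coordinate.

```python
def to_translated(point, origin):
    """Old coordinates -> new coordinates: subtract the new origin (k, l, ...)."""
    return tuple(p - o for p, o in zip(point, origin))


def from_translated(point_prime, origin):
    """New coordinates -> old coordinates: add the new origin back."""
    return tuple(p + o for p, o in zip(point_prime, origin))


origin = (4, 1)  # xy-coordinates of the x'y'-origin, as in Example 3
print(to_translated((2, 0), origin))     # (-2, -1)
print(from_translated((-1, 5), origin))  # (3, 6)
```

The two results agree with parts (a) and (b) of Example 3.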



Exercise Set 3.1 






1. Draw a right-handed coordinate system and locate the points whose coordinates are

(a) (3, 4, 5)
(b) (-3, 4, 5)
(c) (3, -4, 5)
(d) (3, 4, -5)
(e) (-3, -4, 5)
(f) (-3, 4, -5)
(g) (3, -4, -5)
(h) (-3, -4, -5)
(i) (-3, 0, 0)
(j) (3, 0, 3)
(k) (0, 0, -3)
(l) (0, 3, 0)

2. Sketch the following vectors with the initial points located at the origin:

(a) $\mathbf{v}_1 = (3, 6)$
(b) $\mathbf{v}_2 = (-4, -8)$
(c) $\mathbf{v}_3 = (-4, -3)$
(d) $\mathbf{v}_4 = (5, -4)$
(e) $\mathbf{v}_5 = (3, 0)$
(f) $\mathbf{v}_6 = (0, -7)$
(g) $\mathbf{v}_7 = (3, 4, 5)$
(h) $\mathbf{v}_8 = (3, 3, 0)$
(i) $\mathbf{v}_9 = (0, 0, -3)$

3. Find the components of the vector having initial point $P_1$ and terminal point $P_2$.

(a) $P_1(4, 8)$, $P_2(3, 7)$
(b) $P_1(3, -5)$, $P_2(-4, -7)$
(c) $P_1(-5, 0)$, $P_2(-3, 1)$
(d) $P_1(0, 0)$, $P_2(a, b)$
(e) $P_1(3, -7, 2)$, $P_2(-2, 5, -4)$
(f) $P_1(-1, 0, 2)$, $P_2(0, -1, 0)$
(g) $P_1(a, b, c)$, $P_2(0, 0, 0)$
(h) $P_1(0, 0, 0)$, $P_2(a, b, c)$

4. Find a nonzero vector u with initial point P(-1, 3, -5) such that

(a) u has the same direction as v = (6, 7, -3)

(b) u is oppositely directed to v = (6, 7, -3)

5. Find a nonzero vector u with terminal point Q(2, 0, -5) such that

(a) u has the same direction as v = (4, -2, -1)

(b) u is oppositely directed to v = (4, -2, -1)

6. Let u = (-3, 1, 2), v = (4, 0, -8), and w = (6, -1, -4). Find the components of

(a) v - w
(b) 6u + 2v
(c) -v + u
(d) 5(v - 4u)
(e) -3(v - 8w)
(f) (2u - 7w) - (8v + u)

7. Let u, v, and w be the vectors in Exercise 6. Find the components of the vector x that satisfies 2u - v + x = 7x + w.

8. Let u, v, and w be the vectors in Exercise 6. Find scalars $c_1$, $c_2$, and $c_3$ such that
$c_1\mathbf{u} + c_2\mathbf{v} + c_3\mathbf{w} = (2, 0, 4)$

9. Show that there do not exist scalars $c_1$, $c_2$, and $c_3$ such that
$c_1(-2, 9, 6) + c_2(-3, 2, 1) + c_3(1, 7, 5) = (0, 5, 4)$

10. Find all scalars $c_1$, $c_2$, and $c_3$ such that
$c_1(1, 2, 0) + c_2(2, 1, 1) + c_3(0, 3, 1) = (0, 0, 0)$



11. Let P be the point (2, 3, -2) and Q the point (7, -4, 1).



(a) Find the midpoint of the line segment connecting P and Q. 



(b) Find the point on the line segment connecting P and Q that is j of the way from P to Q. 



12. Suppose an xy-coordinate system is translated to obtain an x'y'-coordinate system whose origin O' has xy-coordinates
(2, -3).

(a) Find the x'y'-coordinates of the point P whose xy-coordinates are (7, 5).

(b) Find the xy-coordinates of the point Q whose x'y'-coordinates are (-3, 6).

(c) Draw the xy and x'y'-coordinate axes and locate the points P and Q.

(d) If v = (3, 7) is a vector in the xy-coordinate system, what are the components of v in the x'y'-coordinate system?

(e) If $\mathbf{v} = (v_1, v_2)$ is a vector in the xy-coordinate system, what are the components of v in the x'y'-coordinate system?



13. Let P be the point (1, 3, 7). If the point (4, 0, -5) is the midpoint of the line segment connecting P and Q, what is Q?

14. Suppose that an xyz-coordinate system is translated to obtain an x'y'z'-coordinate system. Let v be a vector whose
components are $\mathbf{v} = (v_1, v_2, v_3)$ in the xyz-system. Show that v has the same components in the x'y'z'-system.



15. Find the components of u, v, u + v, and u - v for the vectors shown in the accompanying figure.

Figure Ex-15

16. Prove geometrically that if $\mathbf{v} = (v_1, v_2)$, then $k\mathbf{v} = (kv_1, kv_2)$. (Restrict the proof to the case k > 0 illustrated in Figure 3.1.8.
The complete proof would involve various cases that depend on the sign of k and the quadrant in which the vector falls.)



Discussion
Discovery

17. Consider Figure 3.1.13. Discuss a geometric interpretation of the vector

$\mathbf{u} = \overrightarrow{OP_1} + \tfrac{1}{2}\left(\overrightarrow{OP_2} - \overrightarrow{OP_1}\right)$



18. Draw a picture that shows four nonzero vectors whose sum is zero.

19. If you were given four nonzero vectors, how would you construct geometrically a fifth vector that is equal to the sum of the first four? Draw a picture to illustrate your method.

20. Consider a clock with vectors drawn from the center to each hour as shown in the accompanying figure.



(a) What is the sum of the 12 vectors that result if the vector terminating at 12 is doubled in 
length and the other vectors are left alone? 



(b) What is the sum of the 12 vectors that result if the vectors terminating at 3 and 9 are each 
tripled and the others are left alone? 



(c) What is the sum of the 9 vectors that remain if the vectors terminating at 5, 11, and 8 are 
removed? 




Figure Ex-20 



21. Indicate whether the statement is true (T) or false (F). Justify your answer.

(a) If x + y = x + z, then y = z.

(b) If u + v = 0, then au + bv = 0 for all a and b.

(c) Parallel vectors with the same length are equal.

(d) If ax = 0, then either a = 0 or x = 0.

(e) If au + bv = 0, then u and v are parallel vectors.



(f) The vectors $\mathbf{u} = (\sqrt{2}, \sqrt{3})$ and $\mathbf{v} = \left(\frac{1}{\sqrt{2}}, \frac{1}{2}\sqrt{3}\right)$ are equivalent.

3.2

NORM OF A VECTOR;
VECTOR ARITHMETIC

In this section we shall establish the basic rules of vector arithmetic.



Properties of Vector Operations 

The following theorem lists the most important properties of vectors in 2-space and 3-space. 
THEOREM 3.2.1 



Properties of Vector Arithmetic 

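If u, v, and w are vectors in 2- or 3-space and k and l are scalars, then the following relationships hold.

(a) $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$

(b) $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})$

(c) $\mathbf{u} + \mathbf{0} = \mathbf{0} + \mathbf{u} = \mathbf{u}$

(d) $\mathbf{u} + (-\mathbf{u}) = \mathbf{0}$

(e) $k(l\mathbf{u}) = (kl)\mathbf{u}$

(f) $k(\mathbf{u} + \mathbf{v}) = k\mathbf{u} + k\mathbf{v}$

(g) $(k + l)\mathbf{u} = k\mathbf{u} + l\mathbf{u}$

(h) $1\mathbf{u} = \mathbf{u}$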



Before discussing the proof, we note that we have developed two approaches to vectors: geometric, in which vectors are 
represented by arrows or directed line segments, and analytic, in which vectors are represented by pairs or triples of numbers 
called components. As a consequence, the equations in Theorem 1 can be proved either geometrically or analytically. To 
illustrate, we shall prove part (b) both ways. The remaining proofs are left as exercises. 
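The eight properties can also be spot-checked numerically. The following is a minimal sketch, not a proof: the helper names `add` and `scale` are illustrative, and the vectors and scalars are simply those used later in Exercise 8 of this section.

```python
# Component-wise vector operations on tuples (illustrative helper names).
def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(k, u):
    return tuple(k * a for a in u)

u, v, w = (7, -3, 1), (9, 6, 6), (2, 1, -8)
k, l = -2, 5
zero = (0, 0, 0)

# One boolean per part of Theorem 3.2.1, evaluated for this single choice of data.
checks = {
    "a": add(u, v) == add(v, u),
    "b": add(add(u, v), w) == add(u, add(v, w)),
    "c": add(u, zero) == u == add(zero, u),
    "d": add(u, scale(-1, u)) == zero,
    "e": scale(k, scale(l, u)) == scale(k * l, u),
    "f": scale(k, add(u, v)) == add(scale(k, u), scale(k, v)),
    "g": scale(k + l, u) == add(scale(k, u), scale(l, u)),
    "h": scale(1, u) == u,
}
print(checks)
```

Integer components are used so that every comparison is exact; with floating-point components a tolerance would be needed.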



Proof of part (b) (analytic) We shall give the proof for vectors in 3-space; the proof for 2-space is similar. If $\mathbf{u} = (u_1, u_2, u_3)$, $\mathbf{v} = (v_1, v_2, v_3)$, and $\mathbf{w} = (w_1, w_2, w_3)$, then

$(\mathbf{u} + \mathbf{v}) + \mathbf{w} = [(u_1, u_2, u_3) + (v_1, v_2, v_3)] + (w_1, w_2, w_3)$
$= (u_1 + v_1, u_2 + v_2, u_3 + v_3) + (w_1, w_2, w_3)$
$= ([u_1 + v_1] + w_1, [u_2 + v_2] + w_2, [u_3 + v_3] + w_3)$
$= (u_1 + [v_1 + w_1], u_2 + [v_2 + w_2], u_3 + [v_3 + w_3])$
$= (u_1, u_2, u_3) + (v_1 + w_1, v_2 + w_2, v_3 + w_3)$
$= \mathbf{u} + (\mathbf{v} + \mathbf{w})$



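Proof of part (b) (geometric) Let u, v, and w be represented by $\overrightarrow{PQ}$, $\overrightarrow{QR}$, and $\overrightarrow{RS}$ as shown in Figure 3.2.1. Then

$\mathbf{v} + \mathbf{w} = \overrightarrow{QS}$ and $\mathbf{u} + (\mathbf{v} + \mathbf{w}) = \overrightarrow{PS}$

Also,

$\mathbf{u} + \mathbf{v} = \overrightarrow{PR}$ and $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \overrightarrow{PS}$

Therefore,

$\mathbf{u} + (\mathbf{v} + \mathbf{w}) = (\mathbf{u} + \mathbf{v}) + \mathbf{w}$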




Figure 3.2.1 



The vectors u + (v + w) and (u + v) + w are equal.



Remark In light of part (b) of this theorem, the symbol u + v + w is unambiguous since the same sum is obtained no matter
where parentheses are inserted. Moreover, if the vectors u, v, and w are placed "tip to tail," then the sum u + v + w is the vector
from the initial point of u to the terminal point of w (Figure 3.2.1).

Norm of a Vector 

The length of a vector u is often called the norm of u and is denoted by $\|\mathbf{u}\|$. It follows from the Theorem of Pythagoras that the
norm of a vector $\mathbf{u} = (u_1, u_2)$ in 2-space is

$\|\mathbf{u}\| = \sqrt{u_1^2 + u_2^2}$    (1)

(Figure 3.2.2a). Let $\mathbf{u} = (u_1, u_2, u_3)$ be a vector in 3-space. Using Figure 3.2.2b and two applications of the Theorem of
Pythagoras, we obtain

$\|\mathbf{u}\|^2 = (OR)^2 + (RP)^2 = (OQ)^2 + (OS)^2 + (RP)^2 = u_1^2 + u_2^2 + u_3^2$

Thus

$\|\mathbf{u}\| = \sqrt{u_1^2 + u_2^2 + u_3^2}$    (2)



A vector of norm 1 is called a unit vector. 



Figure 3.2.2



Global Positioning 

GPS (Global Positioning System) is the system used by the military, ships, airplane pilots, surveyors, utility companies, automobiles, and hikers to
locate current positions by communicating with a system of satellites. The system, which is operated by the U.S. Department
of Defense, nominally uses 24 satellites that orbit the Earth every 12 hours at a height of about 11,000 miles. These satellites
move in six orbital planes that have been chosen to make between five and eight satellites visible from any point on Earth.



[Figure: the Earth, with the North Pole and the Equator indicated]



To explain how the system works, assume that the Earth is a sphere, and suppose that there is an xyz-coordinate system with
its origin at the Earth's center and its z-axis through the North Pole. Let us assume that relative to this coordinate system a
ship is at an unknown point (x, y, z) at some time t. For simplicity, assume that distances are measured in units equal to the
Earth's radius, so that the coordinates of the ship always satisfy the equation

$x^2 + y^2 + z^2 = 1$

The GPS identifies the ship's coordinates (x, y, z) at a time t using a triangulation system and computed distances from four
satellites. These distances are computed using the speed of light (approximately 0.469 Earth radii per hundredth of a second)
and the time it takes for the signal to travel from the satellite to the ship. For example, if the ship receives the signal at time t
and the satellite indicates that it transmitted the signal at time $t_0$, then the distance d traveled by the signal will be

$d = 0.469(t - t_0)$

In theory, knowing three ship-to-satellite distances would suffice to determine the three unknown coordinates of the ship.
However, the problem is that the ships (or other GPS users) do not generally have clocks that can compute t with sufficient
accuracy for global positioning. Thus, the variable t must be regarded as a fourth unknown, and hence the need for the
distance to a fourth satellite. Suppose that in addition to transmitting the time $t_0$, each satellite also transmits its coordinates
$(x_0, y_0, z_0)$ at that time, thereby allowing d to be computed as

$d = \sqrt{(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2}$

If we now equate the squares of d from both equations and round off to three decimal places, then we obtain the
second-degree equation

$(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = 0.22(t - t_0)^2$

Since there are four different satellites, and we can get an equation like this for each one, we can produce four equations in
the unknowns x, y, z, and t. Although these are second-degree equations, it is possible to use these equations and some
algebra to produce a system of linear equations that can be solved for the unknowns.
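The two ways of computing d can be compared in a short sketch. The ship position, satellite position, and transmission time below are made-up values chosen only so that the arithmetic is easy to follow; the function names are illustrative, and only the constant 0.469 comes from the text.

```python
import math

C = 0.469  # speed of light in Earth radii per hundredth of a second (value from the text)

def distance_from_times(t, t0):
    """Signal travel distance d = 0.469 (t - t0)."""
    return C * (t - t0)

def distance_from_coords(ship, satellite):
    """Distance between the ship (x, y, z) and the satellite (x0, y0, z0)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(ship, satellite)))

# Hypothetical data, not taken from the text.
ship = (0.6, 0.0, 0.8)       # lies on the unit sphere: 0.36 + 0 + 0.64 = 1
satellite = (1.0, 2.0, 3.0)  # made-up satellite coordinates
t0 = 0.0                     # made-up transmission time

d = distance_from_coords(ship, satellite)
t = t0 + d / C               # reception time implied by that distance

# Both sides of the second-degree equation, before rounding 0.469**2 to 0.22.
lhs = sum((p - q) ** 2 for p, q in zip(ship, satellite))
rhs = (C ** 2) * (t - t0) ** 2
print(d, lhs, rhs, round(C ** 2, 3))
```

In practice t is the unknown, so the computation runs the other way: four such equations are combined to solve for x, y, z, and t.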



If $P_1(x_1, y_1, z_1)$ and $P_2(x_2, y_2, z_2)$ are two points in 3-space, then the distance d between them is the norm of the vector $\overrightarrow{P_1P_2}$
(Figure 3.2.3). Since

$\overrightarrow{P_1P_2} = (x_2 - x_1, y_2 - y_1, z_2 - z_1)$

it follows from (2) that

$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$    (3)

Similarly, if $P_1(x_1, y_1)$ and $P_2(x_2, y_2)$ are points in 2-space, then the distance between them is given by

$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$    (4)



Figure 3.2.3



The distance between $P_1$ and $P_2$ is the norm of the vector $\overrightarrow{P_1P_2}$.



EXAMPLE 1 Finding Norm and Distance 



The norm of the vector u = (-3, 2, 1) is

$\|\mathbf{u}\| = \sqrt{(-3)^2 + (2)^2 + (1)^2} = \sqrt{14}$

The distance d between the points $P_1(2, -1, -5)$ and $P_2(4, -3, 1)$ is

$d = \sqrt{(4 - 2)^2 + (-3 + 1)^2 + (1 + 5)^2} = \sqrt{44} = 2\sqrt{11}$
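Formulas (2) and (3) translate directly into code. This is a minimal sketch using only the standard library; the function names `norm` and `distance` are illustrative, and the data are those of Example 1.

```python
import math

def norm(u):
    """Length of a vector in 2-space or 3-space: square root of the sum of squared components."""
    return math.sqrt(sum(a * a for a in u))

def distance(p1, p2):
    """Distance between two points: the norm of the vector from p1 to p2."""
    return norm(tuple(b - a for a, b in zip(p1, p2)))

norm_u = norm((-3, 2, 1))               # Example 1: should agree with sqrt(14)
d = distance((2, -1, -5), (4, -3, 1))   # Example 1: should agree with 2*sqrt(11)
print(norm_u, d)
```

Because `zip` pairs components positionally, the same two functions cover both the 2-space and 3-space formulas.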



From the definition of the product ku, the length of the vector ku is |k| times the length of u. Expressed as an equation, this
statement says that

$\|k\mathbf{u}\| = |k|\,\|\mathbf{u}\|$    (5)

This useful formula is applicable in both 2-space and 3-space. 
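As a quick numerical illustration of Formula (5), the sketch below compares both sides for a few scalars, including a negative one and zero. The helper names and the sample vector are illustrative choices.

```python
import math

def norm(u):
    return math.sqrt(sum(a * a for a in u))

def scale(k, u):
    return tuple(k * a for a in u)

u = (-3, 2, 1)
# Each entry is (k, ||k u||, |k| ||u||); the last two values should agree.
results = [(k, norm(scale(k, u)), abs(k) * norm(u)) for k in (-2, 0, 0.5, 3)]
for row in results:
    print(row)
```

The absolute value matters: for k = -2 the vector reverses direction, but its length is still doubled.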



Exercise Set 3.2 




1. Find the norm of v.

(a) v = (4, -3)

(b) v = (2, 3)

(c) v = (-5, 0)

(d) v = (2, 2, 2)

(e) v = (-7, 2, -1)

(f) v = (0, 6, 0)

2. Find the distance between $P_1$ and $P_2$.

(a) $P_1(3, 4)$, $P_2(5, 1)$

(b) $P_1(-3, 6)$, $P_2(-1, -4)$

(c) $P_1(1, -5, 1)$, $P_2(-1, -2, -1)$

(d) $P_1(3, 3, 3)$, $P_2(6, 0, 3)$

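3. Let u = (2, -2, 3), v = (1, -3, 4), w = (3, 6, -4). In each part, evaluate the expression.

(a) $\|\mathbf{u} + \mathbf{v}\|$

(b) $\|\mathbf{u}\| + \|\mathbf{v}\|$

(c) $\|-2\mathbf{u}\| + 2\|\mathbf{u}\|$

(d) $\|3\mathbf{u} - 5\mathbf{v} + \mathbf{w}\|$

(e) $\frac{1}{\|\mathbf{w}\|}\mathbf{w}$

(f) $\left\|\frac{1}{\|\mathbf{w}\|}\mathbf{w}\right\|$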

4. If $\|\mathbf{v}\| = 2$ and $\|\mathbf{w}\| = 3$, what are the largest and smallest values possible for $\|\mathbf{v} - \mathbf{w}\|$? Give a geometric explanation of your results.

5. Let u = (2, 0, 4) and v = (1, 3, -6). In each of the following, determine, if possible, scalars k, l such that

(a) $k\mathbf{u} + l\mathbf{v} = (5, 9, -14)$

(b) $k\mathbf{u} + l\mathbf{v} = (9, 15, -21)$

6. Let u = (2, 6, -7), v = (-1, -1, 3), and k = 3. If $(2, 14, 11) = k\mathbf{u} + l\mathbf{v}$, what is the value of l?

7. Let v = (-1, 2, 5). Find all scalars k such that $\|k\mathbf{v}\| = 4$.

8. Let u = (7, -3, 1), v = (9, 6, 6), w = (2, 1, -8), k = -2, and l = 5. Verify that these vectors and scalars satisfy the stated equalities from Theorem 1.

(a) part (b)

(b) part (e)

(c) part (f)

(d) part (g)



9.

(a) Show that if v is any nonzero vector, then $\frac{1}{\|\mathbf{v}\|}\mathbf{v}$ is a unit vector.

(b) Use the result in part (a) to find a unit vector that has the same direction as the vector v = (3, 4).

(c) Use the result in part (a) to find a unit vector that is oppositely directed to the vector v = (-2, 3, -6).

10.

(a) Show that the components of the vector $\mathbf{v} = (v_1, v_2)$ in Figure Ex-10a are $v_1 = \|\mathbf{v}\|\cos\theta$ and $v_2 = \|\mathbf{v}\|\sin\theta$.

(b) Let u and v be the vectors in Figure Ex-10b. Use the result in part (a) to find the components of 4u - 5v.



Figure Ex-10



11. 



Let $\mathbf{p}_0 = (x_0, y_0, z_0)$ and $\mathbf{p} = (x, y, z)$. Describe the set of all points (x, y, z) for which $\|\mathbf{p} - \mathbf{p}_0\| = 1$.



12.



Prove geometrically that if u and v are vectors in 2- or 3-space, then $\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$.



13. 



Prove parts (a), (c), and (e) of Theorem 1 analytically. 



14. 



Prove parts (d), (g), and (h) of Theorem 1 analytically. 



Discussion
Discovery

15. For the inequality stated in Exercise 12, is it possible to have $\|\mathbf{u} + \mathbf{v}\| = \|\mathbf{u}\| + \|\mathbf{v}\|$? Explain your reasoning.

16.

(a) What relationship must hold for the point p = (a, b, c) to be equidistant from the origin and the xz-plane? Make sure that the relationship you state is valid for positive and negative values of a, b, and c.

(b) What relationship must hold for the point p = (a, b, c) to be farther from the origin than from the xz-plane? Make sure that the relationship you state is valid for positive and negative values of a, b, and c.

17.

(a) What does the inequality $\|\mathbf{x}\| < 1$ tell you about the location of the point x in the plane?

(b) Write down an inequality that describes the set of points that lie outside the circle of radius 1, centered at the point $\mathbf{x}_0$.

18. The triangles in the accompanying figure should suggest a geometric proof of Theorem 3.2.1 (f) for the case where $k > 0$. Give the proof.




Figure Ex-18

3.3

DOT PRODUCT; 
PROJECTIONS 



In this section we shall discuss an important way of multiplying vectors in 
2-space or 3-space. We shall then give some applications of this 
multiplication to geometry. 



Dot Product of Vectors 

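Let u and v be two nonzero vectors in 2-space or 3-space, and assume these vectors have been positioned so that their initial
points coincide. By the angle between u and v, we shall mean the angle $\theta$ determined by u and v that satisfies $0 \le \theta \le \pi$ (Figure 3.3.1).



Figure 3.3.1



The angle $\theta$ between u and v satisfies $0 \le \theta \le \pi$.



DEFINITION

If u and v are vectors in 2-space or 3-space and $\theta$ is the angle between u and v, then the dot product or Euclidean inner
product $\mathbf{u} \cdot \mathbf{v}$ is defined by

$\mathbf{u} \cdot \mathbf{v} = \begin{cases} \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta & \text{if } \mathbf{u} \ne \mathbf{0} \text{ and } \mathbf{v} \ne \mathbf{0} \\ 0 & \text{if } \mathbf{u} = \mathbf{0} \text{ or } \mathbf{v} = \mathbf{0} \end{cases}$    (1)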



EXAMPLE 1 Dot Product 



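As shown in Figure 3.3.2, the angle between the vectors u = (0, 0, 1) and v = (0, 2, 2) is 45°. Thus

$\mathbf{u} \cdot \mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta = \left(\sqrt{0^2 + 0^2 + 1^2}\right)\left(\sqrt{0^2 + 2^2 + 2^2}\right)\left(\frac{1}{\sqrt{2}}\right) = 2$



Figure 3.3.2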



Component Form of the Dot Product 

For purposes of computation, it is desirable to have a formula that expresses the dot product of two vectors in terms of the 
components of the vectors. We will derive such a formula for vectors in 3-space; the derivation for vectors in 2-space is 
similar. 

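Let $\mathbf{u} = (u_1, u_2, u_3)$ and $\mathbf{v} = (v_1, v_2, v_3)$ be two nonzero vectors. If, as shown in Figure 3.3.3, $\theta$ is the angle between u and v,
then the law of cosines yields

$\|\overrightarrow{PQ}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 - 2\|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta$    (2)

Since $\overrightarrow{PQ} = \mathbf{v} - \mathbf{u}$, we can rewrite (2) as

$\|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta = \tfrac{1}{2}\left(\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 - \|\mathbf{v} - \mathbf{u}\|^2\right)$

or

$\mathbf{u} \cdot \mathbf{v} = \tfrac{1}{2}\left(\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 - \|\mathbf{v} - \mathbf{u}\|^2\right)$

Substituting

$\|\mathbf{u}\|^2 = u_1^2 + u_2^2 + u_3^2, \quad \|\mathbf{v}\|^2 = v_1^2 + v_2^2 + v_3^2$

and

$\|\mathbf{v} - \mathbf{u}\|^2 = (v_1 - u_1)^2 + (v_2 - u_2)^2 + (v_3 - u_3)^2$

we obtain, after simplifying,

$\mathbf{u} \cdot \mathbf{v} = u_1v_1 + u_2v_2 + u_3v_3$    (3)

Although we derived this formula under the assumption that u and v are nonzero, the formula is also valid if u = 0 or v = 0
(verify).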




Figure 3.3.3 



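If $\mathbf{u} = (u_1, u_2)$ and $\mathbf{v} = (v_1, v_2)$ are two vectors in 2-space, then the formula corresponding to (3) is

$\mathbf{u} \cdot \mathbf{v} = u_1v_1 + u_2v_2$    (4)



Finding the Angle Between Vectors

If u and v are nonzero vectors, then Formula (1) can be written as

$\cos\theta = \dfrac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|}$    (5)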



EXAMPLE 2 Dot Product Using (3) 



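Consider the vectors u = (2, -1, 1) and v = (1, 1, 2). Find $\mathbf{u} \cdot \mathbf{v}$ and determine the angle between u and v.

Solution

$\mathbf{u} \cdot \mathbf{v} = u_1v_1 + u_2v_2 + u_3v_3 = (2)(1) + (-1)(1) + (1)(2) = 3$

For the given vectors we have $\|\mathbf{u}\| = \|\mathbf{v}\| = \sqrt{6}$, so from (5),

$\cos\theta = \dfrac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|} = \dfrac{3}{\sqrt{6}\sqrt{6}} = \dfrac{1}{2}$

Thus, $\theta = 60°$.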
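Formulas (3) and (5) can be combined into a short routine. The sketch below assumes nonzero input vectors; the function names are illustrative, and the clamping step is an added safeguard against floating-point roundoff rather than part of the formula.

```python
import math

def dot(u, v):
    """Component form of the dot product, Formula (3) or (4)."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def angle_degrees(u, v):
    """Angle between two nonzero vectors, from Formula (5)."""
    cos_theta = dot(u, v) / (norm(u) * norm(v))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against roundoff just outside [-1, 1]
    return math.degrees(math.acos(cos_theta))

u, v = (2, -1, 1), (1, 1, 2)      # the vectors of Example 2
uv = dot(u, v)
theta = angle_degrees(u, v)
print(uv, theta)
```

Since `math.acos` returns a value between 0 and π, the result always lies in the range required by the definition of the angle between two vectors.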



EXAMPLE 3 A Geometric Problem 



Find the angle between a diagonal of a cube and one of its edges. 



Solution 



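Let k be the length of an edge and introduce a coordinate system as shown in Figure 3.3.4. If we let $\mathbf{u}_1 = (k, 0, 0)$,
$\mathbf{u}_2 = (0, k, 0)$, and $\mathbf{u}_3 = (0, 0, k)$, then the vector

$\mathbf{d} = (k, k, k) = \mathbf{u}_1 + \mathbf{u}_2 + \mathbf{u}_3$

is a diagonal of the cube. The angle $\theta$ between d and the edge $\mathbf{u}_1$ satisfies

$\cos\theta = \dfrac{\mathbf{u}_1 \cdot \mathbf{d}}{\|\mathbf{u}_1\|\,\|\mathbf{d}\|} = \dfrac{k^2}{(k)\left(\sqrt{3k^2}\right)} = \dfrac{1}{\sqrt{3}}$

Thus

$\theta = \cos^{-1}\left(\dfrac{1}{\sqrt{3}}\right) \approx 54.74°$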

Note that this is independent of k, as expected. 
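The independence from k can be observed numerically. In this minimal sketch the edge lengths are arbitrary sample values and the helper names are illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def angle_degrees(u, v):
    cos_theta = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

# Angle between the diagonal d = (k, k, k) and the edge u1 = (k, 0, 0) for several edge lengths.
edge_lengths = (1, 2.5, 10)
angles = [angle_degrees((k, k, k), (k, 0, 0)) for k in edge_lengths]
print(angles)
```

Every entry agrees with the value of about 54.74° found above, because k cancels in the quotient for the cosine.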



Figure 3.3.4



The following theorem shows how the dot product can be used to obtain information about the angle between two vectors; it 
also establishes an important relationship between the norm and the dot product. 



THEOREM 3.3.1 



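Let u and v be vectors in 2- or 3-space.

(a) $\mathbf{v} \cdot \mathbf{v} = \|\mathbf{v}\|^2$; that is, $\|\mathbf{v}\| = (\mathbf{v} \cdot \mathbf{v})^{1/2}$

(b) If the vectors u and v are nonzero and $\theta$ is the angle between them, then

$\theta$ is acute if and only if $\mathbf{u} \cdot \mathbf{v} > 0$
$\theta$ is obtuse if and only if $\mathbf{u} \cdot \mathbf{v} < 0$
$\theta = \pi/2$ if and only if $\mathbf{u} \cdot \mathbf{v} = 0$



Proof (a) Since the angle $\theta$ between v and v is 0, we have

$\mathbf{v} \cdot \mathbf{v} = \|\mathbf{v}\|\,\|\mathbf{v}\|\cos\theta = \|\mathbf{v}\|^2\cos 0 = \|\mathbf{v}\|^2$



Proof (b) Since $\theta$ satisfies $0 \le \theta \le \pi$, it follows that $\theta$ is acute if and only if $\cos\theta > 0$, that $\theta$ is obtuse if and only if $\cos\theta < 0$,
and that $\theta = \pi/2$ if and only if $\cos\theta = 0$. But $\cos\theta$ has the same sign as $\mathbf{u} \cdot \mathbf{v}$ since $\mathbf{u} \cdot \mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta$, $\|\mathbf{u}\| > 0$, and $\|\mathbf{v}\| > 0$.
Thus, the result follows.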



EXAMPLE 4 Finding Dot Products from Components 



If u = (1, -2, 3), v = (-3, 4, 2), and w = (3, 6, 3), then

$\mathbf{u} \cdot \mathbf{v} = (1)(-3) + (-2)(4) + (3)(2) = -5$
$\mathbf{v} \cdot \mathbf{w} = (-3)(3) + (4)(6) + (2)(3) = 21$
$\mathbf{u} \cdot \mathbf{w} = (1)(3) + (-2)(6) + (3)(3) = 0$

Therefore, u and v make an obtuse angle, v and w make an acute angle, and u and w are perpendicular. 
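Theorem 3.3.1(b) means the type of angle can be read off from the sign of the dot product without computing the angle itself. A minimal sketch, assuming nonzero vectors with exact (integer) components; the function name `classify_angle` is illustrative.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def classify_angle(u, v):
    """Classify the angle between nonzero vectors by the sign of u . v (Theorem 3.3.1b)."""
    d = dot(u, v)
    if d > 0:
        return "acute"
    if d < 0:
        return "obtuse"
    return "orthogonal"

u, v, w = (1, -2, 3), (-3, 4, 2), (3, 6, 3)   # the vectors of Example 4
labels = (classify_angle(u, v), classify_angle(v, w), classify_angle(u, w))
print(labels)
```

With floating-point components, the comparison with zero would normally be replaced by a small tolerance.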


Orthogonal Vectors 

Perpendicular vectors are also called orthogonal vectors. In light of Theorem 3.3.1b, two nonzero vectors are orthogonal if
and only if their dot product is zero. If we agree to consider u and v to be perpendicular when either or both of these vectors is
0, then we can state without exception that two vectors u and v are orthogonal (perpendicular) if and only if $\mathbf{u} \cdot \mathbf{v} = 0$. To
indicate that u and v are orthogonal vectors, we write $\mathbf{u} \perp \mathbf{v}$.



EXAMPLE 5 A Vector Perpendicular to a Line 



Show that in 2-space the nonzero vector n = (a, b) is perpendicular to the line $ax + by + c = 0$.

Solution

Let $P_1(x_1, y_1)$ and $P_2(x_2, y_2)$ be distinct points on the line, so that

$ax_1 + by_1 + c = 0$
$ax_2 + by_2 + c = 0$    (6)

Since the vector $\overrightarrow{P_1P_2} = (x_2 - x_1, y_2 - y_1)$ runs along the line (Figure 3.3.5), we need only show that n and $\overrightarrow{P_1P_2}$ are
perpendicular. But on subtracting the equations in (6), we obtain

$a(x_2 - x_1) + b(y_2 - y_1) = 0$

which can be expressed in the form

$(a, b) \cdot (x_2 - x_1, y_2 - y_1) = 0$ or $\mathbf{n} \cdot \overrightarrow{P_1P_2} = 0$

Thus n and $\overrightarrow{P_1P_2}$ are perpendicular.



Figure 3.3.5

The following theorem lists the most important properties of the dot product. They are useful in calculations involving vectors. 

THEOREM 3.3.2 



Properties of the Dot Product 



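If u, v, and w are vectors in 2- or 3-space and k is a scalar, then

(a) $\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$

(b) $\mathbf{u} \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{w}$

(c) $k(\mathbf{u} \cdot \mathbf{v}) = (k\mathbf{u}) \cdot \mathbf{v} = \mathbf{u} \cdot (k\mathbf{v})$

(d) $\mathbf{v} \cdot \mathbf{v} > 0$ if $\mathbf{v} \ne \mathbf{0}$, and $\mathbf{v} \cdot \mathbf{v} = 0$ if $\mathbf{v} = \mathbf{0}$



Proof We shall prove (c) for vectors in 3-space and leave the remaining proofs as exercises. Let $\mathbf{u} = (u_1, u_2, u_3)$ and
$\mathbf{v} = (v_1, v_2, v_3)$; then

$k(\mathbf{u} \cdot \mathbf{v}) = k(u_1v_1 + u_2v_2 + u_3v_3)$
$= (ku_1)v_1 + (ku_2)v_2 + (ku_3)v_3$
$= (k\mathbf{u}) \cdot \mathbf{v}$

Similarly,

$k(\mathbf{u} \cdot \mathbf{v}) = \mathbf{u} \cdot (k\mathbf{v})$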



An Orthogonal Projection 

In many applications it is of interest to "decompose" a vector u into a sum of two terms, one parallel to a specified nonzero
vector a and the other perpendicular to a. If u and a are positioned so that their initial points coincide at a point Q, we can
decompose the vector u as follows (Figure 3.3.6): Drop a perpendicular from the tip of u to the line through a, and construct
the vector $\mathbf{w}_1$ from Q to the foot of this perpendicular. Next form the difference

$\mathbf{w}_2 = \mathbf{u} - \mathbf{w}_1$

As indicated in Figure 3.3.6, the vector $\mathbf{w}_1$ is parallel to a, the vector $\mathbf{w}_2$ is perpendicular to a, and

$\mathbf{w}_1 + \mathbf{w}_2 = \mathbf{w}_1 + (\mathbf{u} - \mathbf{w}_1) = \mathbf{u}$

The vector $\mathbf{w}_1$ is called the orthogonal projection of u on a or sometimes the vector component of u along a. It is denoted by

$\operatorname{proj}_{\mathbf{a}}\mathbf{u}$    (7)

The vector $\mathbf{w}_2$ is called the vector component of u orthogonal to a. Since we have $\mathbf{w}_2 = \mathbf{u} - \mathbf{w}_1$, this vector can be written in
notation (7) as

$\mathbf{w}_2 = \mathbf{u} - \operatorname{proj}_{\mathbf{a}}\mathbf{u}$



Figure 3.3.6



The vector u is the sum of $\mathbf{w}_1$ and $\mathbf{w}_2$, where $\mathbf{w}_1$ is parallel to a and $\mathbf{w}_2$ is perpendicular to a.



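The following theorem gives formulas for calculating $\operatorname{proj}_{\mathbf{a}}\mathbf{u}$ and $\mathbf{u} - \operatorname{proj}_{\mathbf{a}}\mathbf{u}$.
THEOREM 3.3.3



If u and a are vectors in 2-space or 3-space and if $\mathbf{a} \ne \mathbf{0}$, then

$\operatorname{proj}_{\mathbf{a}}\mathbf{u} = \dfrac{\mathbf{u} \cdot \mathbf{a}}{\|\mathbf{a}\|^2}\,\mathbf{a}$    (vector component of u along a)

$\mathbf{u} - \operatorname{proj}_{\mathbf{a}}\mathbf{u} = \mathbf{u} - \dfrac{\mathbf{u} \cdot \mathbf{a}}{\|\mathbf{a}\|^2}\,\mathbf{a}$    (vector component of u orthogonal to a)



Proof Let $\mathbf{w}_1 = \operatorname{proj}_{\mathbf{a}}\mathbf{u}$ and $\mathbf{w}_2 = \mathbf{u} - \operatorname{proj}_{\mathbf{a}}\mathbf{u}$. Since $\mathbf{w}_1$ is parallel to a, it must be a scalar multiple of a, so it can be written in
the form $\mathbf{w}_1 = k\mathbf{a}$. Thus

$\mathbf{u} = \mathbf{w}_1 + \mathbf{w}_2 = k\mathbf{a} + \mathbf{w}_2$    (8)

Taking the dot product of both sides of (8) with a and using Theorems 3.3.1a and 3.3.2 yields

$\mathbf{u} \cdot \mathbf{a} = (k\mathbf{a} + \mathbf{w}_2) \cdot \mathbf{a} = k\|\mathbf{a}\|^2 + \mathbf{w}_2 \cdot \mathbf{a}$    (9)

But $\mathbf{w}_2 \cdot \mathbf{a} = 0$ since $\mathbf{w}_2$ is perpendicular to a; so (9) yields

$k = \dfrac{\mathbf{u} \cdot \mathbf{a}}{\|\mathbf{a}\|^2}$

Since $\operatorname{proj}_{\mathbf{a}}\mathbf{u} = \mathbf{w}_1 = k\mathbf{a}$, we obtain

$\operatorname{proj}_{\mathbf{a}}\mathbf{u} = \dfrac{\mathbf{u} \cdot \mathbf{a}}{\|\mathbf{a}\|^2}\,\mathbf{a}$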



EXAMPLE 6 Vector Component of u Along a 



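Let u = (2, -1, 3) and a = (4, -1, 2). Find the vector component of u along a and the vector component of u orthogonal to
a.



Solution

$\mathbf{u} \cdot \mathbf{a} = (2)(4) + (-1)(-1) + (3)(2) = 15$
$\|\mathbf{a}\|^2 = 4^2 + (-1)^2 + 2^2 = 21$

Thus the vector component of u along a is

$\operatorname{proj}_{\mathbf{a}}\mathbf{u} = \dfrac{\mathbf{u} \cdot \mathbf{a}}{\|\mathbf{a}\|^2}\,\mathbf{a} = \dfrac{15}{21}(4, -1, 2) = \left(\dfrac{20}{7}, -\dfrac{5}{7}, \dfrac{10}{7}\right)$

and the vector component of u orthogonal to a is

$\mathbf{u} - \operatorname{proj}_{\mathbf{a}}\mathbf{u} = (2, -1, 3) - \left(\dfrac{20}{7}, -\dfrac{5}{7}, \dfrac{10}{7}\right) = \left(-\dfrac{6}{7}, -\dfrac{2}{7}, \dfrac{11}{7}\right)$

As a check, the reader may wish to verify that the vectors $\mathbf{u} - \operatorname{proj}_{\mathbf{a}}\mathbf{u}$ and a are perpendicular by showing that their dot product
is zero.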
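The suggested check can be carried out with exact rational arithmetic. This sketch uses Python's standard `fractions` module so that the sevenths appear exactly; the function name `proj` is illustrative, and the vectors are those of Example 6.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def proj(u, a):
    """Orthogonal projection of u on a nonzero vector a: ((u . a) / ||a||^2) a."""
    c = Fraction(dot(u, a), dot(a, a))
    return tuple(c * x for x in a)

u, a = (2, -1, 3), (4, -1, 2)
w1 = proj(u, a)                              # vector component of u along a
w2 = tuple(x - y for x, y in zip(u, w1))     # vector component of u orthogonal to a
print(w1, w2, dot(w2, a))
```

Using `dot(a, a)` for the squared norm avoids taking a square root, which is what keeps the computation exact.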



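A formula for the length of the vector component of u along a can be obtained by writing

$\|\operatorname{proj}_{\mathbf{a}}\mathbf{u}\| = \left\|\dfrac{\mathbf{u} \cdot \mathbf{a}}{\|\mathbf{a}\|^2}\,\mathbf{a}\right\|$
$= \left|\dfrac{\mathbf{u} \cdot \mathbf{a}}{\|\mathbf{a}\|^2}\right|\,\|\mathbf{a}\|$    [Formula (5) of Section 3.2]
$= \dfrac{|\mathbf{u} \cdot \mathbf{a}|}{\|\mathbf{a}\|^2}\,\|\mathbf{a}\|$    [since $\|\mathbf{a}\|^2 > 0$]

which yields

$\|\operatorname{proj}_{\mathbf{a}}\mathbf{u}\| = \dfrac{|\mathbf{u} \cdot \mathbf{a}|}{\|\mathbf{a}\|}$    (10)

If $\theta$ denotes the angle between u and a, then $\mathbf{u} \cdot \mathbf{a} = \|\mathbf{u}\|\,\|\mathbf{a}\|\cos\theta$, so (10) can also be written as

$\|\operatorname{proj}_{\mathbf{a}}\mathbf{u}\| = \|\mathbf{u}\|\,|\cos\theta|$    (11)

(Verify.) A geometric interpretation of this result is given in Figure 3.3.7.



Figure 3.3.7 (panel (a): $0 \le \theta < \pi/2$, length $\|\mathbf{u}\|\cos\theta$; panel (b): length $-\|\mathbf{u}\|\cos\theta$)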

As an example, we will use vector methods to derive a formula for the distance from a point in the plane to a line. 



EXAMPLE 7 Distance Between a Point and a Line 



Find a formula for the distance D between point $P_0(x_0, y_0)$ and the line $ax + by + c = 0$.



Solution

Let $Q(x_1, y_1)$ be any point on the line, and position the vector n = (a, b) so that its initial point is at Q.

By virtue of Example 5, the vector n is perpendicular to the line (Figure 3.3.8). As indicated in the figure, the distance D is
equal to the length of the orthogonal projection of $\overrightarrow{QP_0}$ on n; thus, from (10),

$D = \|\operatorname{proj}_{\mathbf{n}}\overrightarrow{QP_0}\| = \dfrac{|\overrightarrow{QP_0} \cdot \mathbf{n}|}{\|\mathbf{n}\|}$

But

$\overrightarrow{QP_0} = (x_0 - x_1, y_0 - y_1)$
$\overrightarrow{QP_0} \cdot \mathbf{n} = a(x_0 - x_1) + b(y_0 - y_1)$
$\|\mathbf{n}\| = \sqrt{a^2 + b^2}$

so

$D = \dfrac{|a(x_0 - x_1) + b(y_0 - y_1)|}{\sqrt{a^2 + b^2}}$    (12)

Since the point $Q(x_1, y_1)$ lies on the line, its coordinates satisfy the equation of the line, so

$ax_1 + by_1 + c = 0$ or $c = -ax_1 - by_1$

Substituting this expression in (12) yields the formula

$D = \dfrac{|ax_0 + by_0 + c|}{\sqrt{a^2 + b^2}}$    (13)



Figure 3.3.8



EXAMPLE 8 Using the Distance Formula 



It follows from Formula (13) that the distance D from the point (1, -2) to the line $3x + 4y - 6 = 0$ is

$D = \dfrac{|(3)(1) + 4(-2) - 6|}{\sqrt{3^2 + 4^2}} = \dfrac{|-11|}{\sqrt{25}} = \dfrac{11}{5}$
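Formulas (12) and (13) should give the same distance, and this can be confirmed for the data of Example 8. In the sketch below the function names are illustrative, and the point Q = (2, 0) is simply one convenient point on the line, since 3(2) + 4(0) - 6 = 0.

```python
import math

def distance_formula_13(a, b, c, x0, y0):
    """Distance from (x0, y0) to the line ax + by + c = 0, Formula (13)."""
    return abs(a * x0 + b * y0 + c) / math.hypot(a, b)

def distance_formula_12(a, b, q, p0):
    """Same distance via Formula (12), using a known point q on the line."""
    (x1, y1), (x0, y0) = q, p0
    return abs(a * (x0 - x1) + b * (y0 - y1)) / math.hypot(a, b)

D13 = distance_formula_13(3, 4, -6, 1, -2)
D12 = distance_formula_12(3, 4, (2, 0), (1, -2))
print(D13, D12)
```

Formula (13) is usually the more convenient of the two because it does not require finding a point on the line first.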



Exercise Set 3.3 






1. Find $\mathbf{u} \cdot \mathbf{v}$.

(a) u = (2, 3), v = (5, -7)

(b) u = (-6, -2), v = (4, 0)

(c) u = (1, -5, 4), v = (3, 3, 3)

(d) u = (-2, 2, 3), v = (1, 7, -4)

2. In each part of Exercise 1, find the cosine of the angle between u and v.

3. Determine whether u and v make an acute angle, make an obtuse angle, or are orthogonal.

(a) u = (6, 1, 4), v = (2, 0, -3)

(b) u = (0, 0, -1), v = (1, 1, 1)

(c) u = (-6, 0, 4), v = (3, 1, 6)

(d) u = (2, 4, -8), v = (5, 3, 7)

4. Find the orthogonal projection of u on a.

(a) u = (6, 2), a = (3, -9)

(b) u = (-1, -2), a = (-2, 3)

(c) u = (3, 1, -7), a = (1, 0, 5)

(d) u = (1, 0, 0), a = (4, 3, 8)

5. In each part of Exercise 4, find the vector component of u orthogonal to a.

6. In each part, find $\|\operatorname{proj}_{\mathbf{a}}\mathbf{u}\|$.

(a) u = (1, -2), a = (-4, -3)

(b) u = (5, 6), a = (2, -1)

(c) u = (3, 0, 4), a = (2, 3, 3)

(d) u = (3, -2, 6), a = (1, 2, -7)

7. Let u = (5, -2, 1), v = (1, 6, 3), and k = -4. Verify Theorem 3.3.2 for these quantities.



8.

(a) Show that v = (a, b) and w = (-b, a) are orthogonal vectors.



(b) Use the result in part (a) to find two vectors that are orthogonal to v = (2, -3).

(c) Find two unit vectors that are orthogonal to ( _ 2, A)- 

9. Let u = (3, 4), v = (5, -1), and w = (7, 1). Evaluate the expressions.

(a) $\mathbf{u} \cdot (7\mathbf{v} + \mathbf{w})$

(b) $\|(\mathbf{u} \cdot \mathbf{w})\mathbf{w}\|$

(c) $\|\mathbf{u}\|(\mathbf{v} \cdot \mathbf{w})$

(d) $(\|\mathbf{u}\|\mathbf{v}) \cdot \mathbf{w}$

10. Find five different nonzero vectors that are orthogonal to u = (5, -2, 3).

11. Use vectors to find the cosines of the interior angles of the triangle with vertices (0, -1), (1, -2), and (4, 1).

12. Show that A(3, 0, 2), B(4, 3, 0), and C(8, 1, -1) are vertices of a right triangle. At which vertex is the right angle?

13. Find a unit vector that is orthogonal to both u = (1, 0, 1) and v = (0, 1, 1).

14. A vector a in the xy-plane has a length of 9 units and points in a direction that is 120° counterclockwise from the positive x-axis, and a vector b in that plane has a length of 5 units and points in the positive y-direction. Find $\mathbf{a} \cdot \mathbf{b}$.

15. A vector a in the xy-plane points in a direction that is 47° counterclockwise from the positive x-axis, and a vector b in that plane points in a direction that is 43° clockwise from the positive x-axis. What can you say about the value of $\mathbf{a} \cdot \mathbf{b}$?

16. Let p = (2, k) and q = (3, 5). Find k such that

(a) p and q are parallel

(b) p and q are orthogonal

(c) the angle between p and q is $\pi/3$

(d) the angle between p and q is $\pi/4$

17. Use Formula (13) to calculate the distance between the point and the line.

(a) $4x + 3y + 4 = 0$; (-3, 1)

(b) $y = -4x + 2$; (2, -5)

(c) $3x + y = 5$; (1, 8)

18. Establish the identity $\|\mathbf{u} + \mathbf{v}\|^2 + \|\mathbf{u} - \mathbf{v}\|^2 = 2\|\mathbf{u}\|^2 + 2\|\mathbf{v}\|^2$.

19. Establish the identity $\mathbf{u} \cdot \mathbf{v} = \frac{1}{4}\|\mathbf{u} + \mathbf{v}\|^2 - \frac{1}{4}\|\mathbf{u} - \mathbf{v}\|^2$.



20. Find the angle between a diagonal of a cube and one of its faces.

21. Let i, j, and k be unit vectors along the positive x, y, and z axes of a rectangular coordinate system in 3-space. If
v = (a, b, c) is a nonzero vector, then the angles $\alpha$, $\beta$, and $\gamma$ between v and the vectors i, j, and k, respectively, are called
the direction angles of v (see accompanying figure), and the numbers $\cos\alpha$, $\cos\beta$, and $\cos\gamma$ are called the direction
cosines of v.

(a) Show that $\cos\alpha = a/\|\mathbf{v}\|$.

(b) Find $\cos\beta$ and $\cos\gamma$.

(c) Show that $\mathbf{v}/\|\mathbf{v}\| = (\cos\alpha, \cos\beta, \cos\gamma)$.

(d) Show that $\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$.



Figure Ex-21

22. Use the result in Exercise 21 to estimate, to the nearest degree, the angles that a diagonal of a box with dimensions 10 cm
x 15 cm x 25 cm makes with the edges of the box.

Note A calculator is needed.

23. Referring to Exercise 21, show that two nonzero vectors, $\mathbf{v}_1$ and $\mathbf{v}_2$, in 3-space are perpendicular if and only if their
direction cosines satisfy

$\cos\alpha_1\cos\alpha_2 + \cos\beta_1\cos\beta_2 + \cos\gamma_1\cos\gamma_2 = 0$



24.

(a) Find the area of the triangle with vertices A(2, 3), C(4, 7), and D(-5, 8).

(b) Find the coordinates of the point B such that the quadrilateral ABCD is a parallelogram. What is the area of this
parallelogram?

25. Show that if v is orthogonal to both $\mathbf{w}_1$ and $\mathbf{w}_2$, then v is orthogonal to $k_1\mathbf{w}_1 + k_2\mathbf{w}_2$ for all scalars $k_1$ and $k_2$.

26. Let u and v be nonzero vectors in 2- or 3-space, and let $k = \|\mathbf{u}\|$ and $l = \|\mathbf{v}\|$. Show that the vector $\mathbf{w} = l\mathbf{u} + k\mathbf{v}$ bisects the
angle between u and v.



Discussion
Discovery

27. In each part, something is wrong with the expression. What?

(a) $\mathbf{u} \cdot (\mathbf{v} \cdot \mathbf{w})$

(b) $(\mathbf{u} \cdot \mathbf{v}) + \mathbf{w}$

(c) $\|\mathbf{u} \cdot \mathbf{v}\|$

(d) $k \cdot (\mathbf{u} + \mathbf{v})$

28. Is it possible to have $\operatorname{proj}_{\mathbf{a}}\mathbf{u} = \operatorname{proj}_{\mathbf{u}}\mathbf{a}$? Explain your reasoning.

29. If $\mathbf{u} \ne \mathbf{0}$, is it valid to cancel u from both sides of the equation $\mathbf{u} \cdot \mathbf{v} = \mathbf{u} \cdot \mathbf{w}$ and conclude that
v = w? Explain your reasoning.

30. Suppose that u, v, and w are mutually orthogonal nonzero vectors in 3-space, and suppose that
you know the dot products of these vectors with a vector r in 3-space. Find an expression for r in
terms of u, v, w, and the dot products.

Hint Look for an expression of the form $\mathbf{r} = c_1\mathbf{u} + c_2\mathbf{v} + c_3\mathbf{w}$.



31. Suppose that u and v are orthogonal vectors in 2-space or 3-space. What famous theorem is
described by the equation $\|\mathbf{u} + \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2$? Draw a picture to support your answer.

3.4

CROSS PRODUCT 



In many applications of vectors to problems in geometry, physics, and 
engineering, it is of interest to construct a vector in 3-space that is perpendicular 
to two given vectors. In this section we shall show how to do this. 



Cross Product of Vectors 

Recall from Section 3.3 that the dot product of two vectors in 2-space or 3-space produces a scalar. We will now define a type of 
vector multiplication that produces a vector as the product but that is applicable only in 3-space. 



DEFINITION 



If $\mathbf{u} = (u_1, u_2, u_3)$ and $\mathbf{v} = (v_1, v_2, v_3)$ are vectors in 3-space, then the cross product $\mathbf{u} \times \mathbf{v}$ is the vector defined by

$\mathbf{u} \times \mathbf{v} = (u_2v_3 - u_3v_2,\; u_3v_1 - u_1v_3,\; u_1v_2 - u_2v_1)$

or, in determinant notation,

$\mathbf{u} \times \mathbf{v} = \left(\begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix},\; -\begin{vmatrix} u_1 & u_3 \\ v_1 & v_3 \end{vmatrix},\; \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix}\right)$    (1)



Remark Instead of memorizing (1), you can obtain the components of $\mathbf{u} \times \mathbf{v}$ as follows:

Form the 2 x 3 matrix

$\begin{bmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{bmatrix}$

whose first row contains the components of u and whose second row contains the components of v.

To find the first component of $\mathbf{u} \times \mathbf{v}$, delete the first column and take the determinant; to find the second component, delete
the second column and take the negative of the determinant; and to find the third component, delete the third column and take
the determinant.



EXAMPLE 1 Calculating a Cross Product 



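Find $\mathbf{u} \times \mathbf{v}$, where u = (1, 2, -2) and v = (3, 0, 1).



Solution

From either (1) or the mnemonic in the preceding remark, we have

$\mathbf{u} \times \mathbf{v} = \left(\begin{vmatrix} 2 & -2 \\ 0 & 1 \end{vmatrix},\; -\begin{vmatrix} 1 & -2 \\ 3 & 1 \end{vmatrix},\; \begin{vmatrix} 1 & 2 \\ 3 & 0 \end{vmatrix}\right) = (2, -7, -6)$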


There is an important difference between the dot product and cross product of two vectors — the dot product is a scalar and the 
cross product is a vector. The following theorem gives some important relationships between the dot product and cross product 
and also shows that u x v is orthogonal to both u and v. 

THEOREM 3.4.1 



Relationships Involving Cross Product and Dot Product 

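If u, v, and w are vectors in 3-space, then

(a) $\mathbf{u} \cdot (\mathbf{u} \times \mathbf{v}) = 0$    ($\mathbf{u} \times \mathbf{v}$ is orthogonal to u)

(b) $\mathbf{v} \cdot (\mathbf{u} \times \mathbf{v}) = 0$    ($\mathbf{u} \times \mathbf{v}$ is orthogonal to v)

(c) $\|\mathbf{u} \times \mathbf{v}\|^2 = \|\mathbf{u}\|^2\|\mathbf{v}\|^2 - (\mathbf{u} \cdot \mathbf{v})^2$    (Lagrange's identity)

(d) $\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) = (\mathbf{u} \cdot \mathbf{w})\mathbf{v} - (\mathbf{u} \cdot \mathbf{v})\mathbf{w}$    (relationship between cross and dot products)

(e) $(\mathbf{u} \times \mathbf{v}) \times \mathbf{w} = (\mathbf{u} \cdot \mathbf{w})\mathbf{v} - (\mathbf{v} \cdot \mathbf{w})\mathbf{u}$    (relationship between cross and dot products)



Proof (a) Let $\mathbf{u} = (u_1, u_2, u_3)$ and $\mathbf{v} = (v_1, v_2, v_3)$. Then

$\mathbf{u} \cdot (\mathbf{u} \times \mathbf{v}) = (u_1, u_2, u_3) \cdot (u_2v_3 - u_3v_2,\; u_3v_1 - u_1v_3,\; u_1v_2 - u_2v_1)$
$= u_1(u_2v_3 - u_3v_2) + u_2(u_3v_1 - u_1v_3) + u_3(u_1v_2 - u_2v_1) = 0$



Proof (b) Similar to (a).



Proof (c) Since

$\|\mathbf{u} \times \mathbf{v}\|^2 = (u_2v_3 - u_3v_2)^2 + (u_3v_1 - u_1v_3)^2 + (u_1v_2 - u_2v_1)^2$    (2)

and

$\|\mathbf{u}\|^2\|\mathbf{v}\|^2 - (\mathbf{u} \cdot \mathbf{v})^2 = (u_1^2 + u_2^2 + u_3^2)(v_1^2 + v_2^2 + v_3^2) - (u_1v_1 + u_2v_2 + u_3v_3)^2$    (3)

the proof can be completed by "multiplying out" the right sides of (2) and (3) and verifying their equality.



Proof (d) and (e) See Exercises 26 and 27.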




Joseph Louis Lagrange (1736-1813) was a French-Italian mathematician and astronomer. Although his father wanted him to 
become a lawyer, Lagrange was attracted to mathematics and astronomy after reading a memoir by the astronomer Halley. At 
age 16 he began to study mathematics on his own and by age 19 was appointed to a professorship at the Royal Artillery School 
in Turin. The following year he solved some famous problems using new methods that eventually blossomed into a branch of
mathematics called the calculus of variations . These methods and Lagrange's applications of them to problems in celestial 
mechanics were so monumental that by age 25 he was regarded by many of his contemporaries as the greatest living 
mathematician. One of Lagrange's most famous works is a memoir, Mécanique Analytique, in which he reduced the theory of
mechanics to a few general formulas from which all other necessary equations could be derived. 

Napoleon was a great admirer of Lagrange and showered him with many honors. In spite of his fame, Lagrange was a shy and 
modest man. On his death, he was buried with honor in the Pantheon. 



EXAMPLE 2 u x V Is Perpendicular to u and to v 



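Consider the vectors

u = (1, 2, -2) and v = (3, 0, 1)

In Example 1 we showed that

$\mathbf{u} \times \mathbf{v} = (2, -7, -6)$

Since

$\mathbf{u} \cdot (\mathbf{u} \times \mathbf{v}) = (1)(2) + (2)(-7) + (-2)(-6) = 0$

and

$\mathbf{v} \cdot (\mathbf{u} \times \mathbf{v}) = (3)(2) + (0)(-7) + (1)(-6) = 0$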

u x v is orthogonal to both u and v, as guaranteed by Theorem 3.4.1. 
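The component formula for the cross product, the orthogonality in parts (a) and (b) of Theorem 3.4.1, and Lagrange's identity in part (c) can all be checked for these two vectors. A minimal sketch with illustrative function names; integer components keep every comparison exact.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product of two vectors in 3-space, from the component formula in the definition."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

u, v = (1, 2, -2), (3, 0, 1)
c = cross(u, v)

orthogonal_to_u = dot(u, c) == 0                      # Theorem 3.4.1(a)
orthogonal_to_v = dot(v, c) == 0                      # Theorem 3.4.1(b)
lagrange_lhs = dot(c, c)                              # ||u x v||^2
lagrange_rhs = dot(u, u) * dot(v, v) - dot(u, v) ** 2  # ||u||^2 ||v||^2 - (u . v)^2
print(c, orthogonal_to_u, orthogonal_to_v, lagrange_lhs, lagrange_rhs)
```

Note that the tuple indices are zero-based, so `u[1] * v[2]` corresponds to $u_2v_3$ in the text.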

The main arithmetic properties of the cross product are listed in the next theorem. 

THEOREM 3.4.2 



Properties of Cross Product 

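If u, v, and w are any vectors in 3-space and k is any scalar, then

(a) $\mathbf{u} \times \mathbf{v} = -(\mathbf{v} \times \mathbf{u})$

(b) $\mathbf{u} \times (\mathbf{v} + \mathbf{w}) = (\mathbf{u} \times \mathbf{v}) + (\mathbf{u} \times \mathbf{w})$

(c) $(\mathbf{u} + \mathbf{v}) \times \mathbf{w} = (\mathbf{u} \times \mathbf{w}) + (\mathbf{v} \times \mathbf{w})$

(d) $k(\mathbf{u} \times \mathbf{v}) = (k\mathbf{u}) \times \mathbf{v} = \mathbf{u} \times (k\mathbf{v})$

(e) $\mathbf{u} \times \mathbf{0} = \mathbf{0} \times \mathbf{u} = \mathbf{0}$

(f) $\mathbf{u} \times \mathbf{u} = \mathbf{0}$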



The proofs follow immediately from Formula 1 and properties of determinants; for example, (a) can be proved as follows: 



Proof (a) Interchanging u and v in 1 interchanges the rows of the three determinants on the right side of 1 and hence changes the 
sign of each component in the cross product. Thus u x v = — (v x u). 



The proofs of the remaining parts are left as exercises. 



EXAMPLE 3 Standard Unit Vectors 



Consider the vectors 

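i = (1, 0, 0), j = (0, 1, 0), k = (0, 0, 1)

These vectors each have length 1 and lie along the coordinate axes (Figure 3.4.1). They are called the standard unit vectors in
3-space. Every vector v = (v1, v2, v3) in 3-space is expressible in terms of i, j, and k since we can write

v = (v1, v2, v3) = v1(1, 0, 0) + v2(0, 1, 0) + v3(0, 0, 1) = v1 i + v2 j + v3 k
For example,

(2, -3, 4) = 2i - 3j + 4k
From (1) we obtain

\[
\mathbf{i} \times \mathbf{j} =
\left(
\begin{vmatrix} 0 & 0 \\ 1 & 0 \end{vmatrix},\;
-\begin{vmatrix} 1 & 0 \\ 0 & 0 \end{vmatrix},\;
\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix}
\right)
= (0, 0, 1) = \mathbf{k}
\]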




Figure 3.4.1 



The standard unit vectors. 



The reader should have no trouble obtaining the following results: 

i × i = j × j = k × k = 0

i × j = k      j × k = i      k × i = j

j × i = -k     k × j = -i     i × k = -j

Figure 3.4.2 is helpful for remembering these results. Referring to this diagram, the cross product of two consecutive vectors going 
clockwise is the next vector around, and the cross product of two consecutive vectors going counterclockwise is the negative of the 
next vector around. 
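
The nine products of the standard unit vectors can also be tabulated mechanically. The sketch below is illustrative only: the helpers `cross` and `label` are our own names, and `label` merely prints a result as 0, ±i, ±j, or ±k.

```python
def cross(u, v):
    """Cross product of two 3-vectors, computed componentwise."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)


i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
names = {i: "i", j: "j", k: "k"}


def label(v):
    """Name a vector that is 0 or plus/minus a standard unit vector."""
    if v == (0, 0, 0):
        return "0"
    if v in names:
        return names[v]
    return "-" + names[tuple(-c for c in v)]


# One row per left factor: i x i, i x j, i x k, then the j row, then the k row.
for a in (i, j, k):
    print("   ".join(f"{names[a]} x {names[b]} = {label(cross(a, b))}"
                     for b in (i, j, k)))
```

Reading across each printed row reproduces the table above, including the sign change when the order of the factors is reversed.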




Figure 3.4.2 



Determinant Form of Cross Product 



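It is also worth noting that a cross product can be represented symbolically in the form of a formal 3 × 3 determinant:

\[
\mathbf{u} \times \mathbf{v} =
\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}
= \begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix}\mathbf{i}
- \begin{vmatrix} u_1 & u_3 \\ v_1 & v_3 \end{vmatrix}\mathbf{j}
+ \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix}\mathbf{k}
\tag{4}
\]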



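For example, if u = (1, 2, -2) and v = (3, 0, 1), then

\[
\mathbf{u} \times \mathbf{v} =
\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 2 & -2 \\ 3 & 0 & 1 \end{vmatrix}
= 2\mathbf{i} - 7\mathbf{j} - 6\mathbf{k}
\]

which agrees with the result obtained in Example 1.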



Warning It is not true in general that u × (v × w) = (u × v) × w. For example,

i × (j × j) = i × 0 = 0

and

(i × j) × j = k × j = -i

so

i × (j × j) ≠ (i × j) × j

We know from Theorem 3.4.1 that u × v is orthogonal to both u and v. If u and v are nonzero vectors, it can be shown that the
direction of u × v can be determined using the following "right-hand rule" (Figure 3.4.3): Let θ be the angle between u and v, and
suppose u is rotated through the angle θ until it coincides with v. If the fingers of the right hand are cupped so that they point in the
direction of rotation, then the thumb indicates (roughly) the direction of u × v.




Figure 3.4.3 

The reader may find it instructive to practice this rule with the products 

i × j = k, j × k = i, k × i = j

Geometric Interpretation of Cross Product 

If u and v are vectors in 3-space, then the norm of u × v has a useful geometric interpretation. Lagrange's identity, given in
Theorem 3.4.1, states that

||u × v||² = ||u||² ||v||² - (u · v)²     (5)

If θ denotes the angle between u and v, then u · v = ||u|| ||v|| cos θ, so (5) can be rewritten as

||u × v||² = ||u||² ||v||² - ||u||² ||v||² cos² θ
           = ||u||² ||v||² (1 - cos² θ)
           = ||u||² ||v||² sin² θ
Since 0 ≤ θ ≤ π, it follows that sin θ ≥ 0, so this can be rewritten as

||u × v|| = ||u|| ||v|| sin θ     (6)

But ||v|| sin θ is the altitude of the parallelogram determined by u and v (Figure 3.4.4). Thus, from (6), the area A of this
parallelogram is given by

A = (base)(altitude) = ||u|| ||v|| sin θ = ||u × v||




Figure 3.4.4 

This result is correct even if u and v are collinear, since the parallelogram determined by u and v then has zero area and from (6) we have
u × v = 0 because sin θ = 0 in this case. Thus we have the following theorem.



THEOREM 3.4.3 



Area of a Parallelogram 

If u and v are vectors in 3-space, then ||u × v|| is equal to the area of the parallelogram determined by u and v.



EXAMPLE 4 Area of a Triangle 



Find the area of the triangle determined by the points P1(2, 2, 0), P2(-1, 0, 2), and P3(0, 4, 3).

Solution

The area A of the triangle is 1/2 the area of the parallelogram determined by the vectors P1P2 and P1P3 (Figure 3.4.5). Using the
method discussed in Example 2 of Section 3.1, P1P2 = (-3, -2, 2) and P1P3 = (-2, 2, 3). It follows that

P1P2 × P1P3 = (-10, 5, -10)

and consequently,

A = (1/2) ||P1P2 × P1P3|| = (1/2)(15) = 15/2

Figure 3.4.5 
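
The computation in Example 4 can be sketched in a few lines of Python. The function name `triangle_area` and the helpers `cross` and `sub` are our own; the method is exactly the one used above, namely half the norm of the cross product of two edge vectors.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors, computed componentwise."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)


def sub(p, q):
    """Vector from point q to point p."""
    return tuple(a - b for a, b in zip(p, q))


def triangle_area(p1, p2, p3):
    """Half the area of the parallelogram spanned by P1P2 and P1P3."""
    n = cross(sub(p2, p1), sub(p3, p1))
    return 0.5 * math.sqrt(sum(c * c for c in n))


p1, p2, p3 = (2, 2, 0), (-1, 0, 2), (0, 4, 3)
print(cross(sub(p2, p1), sub(p3, p1)))  # (-10, 5, -10)
print(triangle_area(p1, p2, p3))        # 7.5
```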



DEFINITION 



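If u, v, and w are vectors in 3-space, then

u · (v × w)

is called the scalar triple product of u, v, and w.

The scalar triple product of u = (u1, u2, u3), v = (v1, v2, v3), and w = (w1, w2, w3) can be calculated from the formula

\[
\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) =
\begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}
\tag{7}
\]

This follows from Formula 4 since

\[
\begin{aligned}
\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})
&= \mathbf{u} \cdot \left(
\begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix}\mathbf{i}
- \begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix}\mathbf{j}
+ \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix}\mathbf{k}
\right) \\
&= \begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix} u_1
- \begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix} u_2
+ \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix} u_3 \\
&= \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}
\end{aligned}
\]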



EXAMPLE 5 Calculating a Scalar Triple Product 



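Calculate the scalar triple product u · (v × w) of the vectors

u = 3i - 2j - 5k, v = i + 4j - 4k, w = 3j + 2k

Solution

From (7),

\[
\begin{aligned}
\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})
&= \begin{vmatrix} 3 & -2 & -5 \\ 1 & 4 & -4 \\ 0 & 3 & 2 \end{vmatrix} \\
&= 3 \begin{vmatrix} 4 & -4 \\ 3 & 2 \end{vmatrix}
- (-2) \begin{vmatrix} 1 & -4 \\ 0 & 2 \end{vmatrix}
+ (-5) \begin{vmatrix} 1 & 4 \\ 0 & 3 \end{vmatrix} \\
&= 60 + 4 - 15 = 49
\end{aligned}
\]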



Remark The symbol (u · v) × w makes no sense because we cannot form the cross product of a scalar and a vector. Thus no
ambiguity arises if we write u · v × w rather than u · (v × w). However, for clarity we shall usually keep the parentheses.

It follows from (7) that

u · (v × w) = w · (u × v) = v · (w × u)

since the 3 × 3 determinants that represent these products can be obtained from one another by two row interchanges. (Verify.)
These relationships can be remembered by moving the vectors u, v, and w clockwise around the vertices of the triangle in Figure
3.4.6.




Figure 3.4.6 
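
Both the value found in Example 5 and the cyclic identity above can be checked with a short script. This is a sketch under our own naming (`cross`, `dot`, `triple`); it evaluates u · (v × w) directly rather than by expanding the determinant in (7), which gives the same number.

```python
def cross(u, v):
    """Cross product of two 3-vectors, computed componentwise."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)


def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def triple(u, v, w):
    """Scalar triple product u . (v x w)."""
    return dot(u, cross(v, w))


# Vectors of Example 5, written as component triples.
u = (3, -2, -5)
v = (1, 4, -4)
w = (0, 3, 2)

print(triple(u, v, w))  # 49
print(triple(w, u, v))  # 49, cyclic shift of the three vectors
print(triple(v, w, u))  # 49, cyclic shift again
print(triple(u, w, v))  # -49, a single interchange flips the sign
```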

Geometric Interpretation of Determinants 

The next theorem provides a useful geometric interpretation of 2 × 2 and 3 × 3 determinants.

THEOREM 3.4.4

(a) The absolute value of the determinant

\[
\det \begin{bmatrix} u_1 & u_2 \\ v_1 & v_2 \end{bmatrix}
\]

is equal to the area of the parallelogram in 2-space determined by the vectors u = (u1, u2) and
v = (v1, v2). (See Figure 3.4.7a.)

(b) The absolute value of the determinant

\[
\det \begin{bmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{bmatrix}
\]

is equal to the volume of the parallelepiped in 3-space determined by the vectors u = (u1, u2, u3),
v = (v1, v2, v3), and w = (w1, w2, w3). (See Figure 3.4.7b.)

Figure 3.4.7



Proof (a) The key to the proof is to use Theorem 3.4.3. However, that theorem applies to vectors in 3-space, whereas u = (u1, u2)
and v = (v1, v2) are vectors in 2-space. To circumvent this "dimension problem," we shall view u and v as vectors in the xy-plane
of an xyz-coordinate system (Figure 3.4.8a), in which case these vectors are expressed as u = (u1, u2, 0) and v = (v1, v2, 0). Thus

\[
\mathbf{u} \times \mathbf{v} =
\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ u_1 & u_2 & 0 \\ v_1 & v_2 & 0 \end{vmatrix}
= \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix}\mathbf{k}
= \det \begin{bmatrix} u_1 & u_2 \\ v_1 & v_2 \end{bmatrix}\mathbf{k}
\]

Figure 3.4.8

It now follows from Theorem 3.4.3 and the fact that ||k|| = 1 that the area A of the parallelogram determined by u and v is

\[
A = \|\mathbf{u} \times \mathbf{v}\|
= \left\| \det \begin{bmatrix} u_1 & u_2 \\ v_1 & v_2 \end{bmatrix}\mathbf{k} \right\|
= \left| \det \begin{bmatrix} u_1 & u_2 \\ v_1 & v_2 \end{bmatrix} \right| \|\mathbf{k}\|
= \left| \det \begin{bmatrix} u_1 & u_2 \\ v_1 & v_2 \end{bmatrix} \right|
\]

which completes the proof.



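Proof (b) As shown in Figure 3.4.8b, take the base of the parallelepiped determined by u, v, and w to be the parallelogram
determined by v and w. It follows from Theorem 3.4.3 that the area of the base is ||v × w|| and, as illustrated in Figure 3.4.8b, the
height h of the parallelepiped is the length of the orthogonal projection of u on v × w. Therefore, by Formula 10 of Section 3.3,

\[
h = \|\operatorname{proj}_{\mathbf{v} \times \mathbf{w}} \mathbf{u}\|
= \frac{|\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})|}{\|\mathbf{v} \times \mathbf{w}\|}
\]

It follows that the volume V of the parallelepiped is

\[
V = (\text{area of base}) \cdot \text{height}
= \|\mathbf{v} \times \mathbf{w}\| \, \frac{|\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})|}{\|\mathbf{v} \times \mathbf{w}\|}
= |\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})|
\]

so from (7),

\[
V = \left| \det \begin{bmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{bmatrix} \right|
\]

which completes the proof.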



Remark If V denotes the volume of the parallelepiped determined by vectors u, v, and w, then it follows from Theorem 3.4.4 and
Formula 7 that

V = [volume of parallelepiped determined by u, v, and w] = |u · (v × w)|     (8)

From this and Theorem 3.3.1b, we can conclude that

u · (v × w) = ±V

where the + or - results depending on whether u makes an acute or an obtuse angle with v × w.

Formula 8 leads to a useful test for ascertaining whether three given vectors lie in the same plane. Since three vectors not in the
same plane determine a parallelepiped of positive volume, it follows from (8) that |u · (v × w)| = 0 if and only if the vectors u, v, and
w lie in the same plane. Thus we have the following result.



THEOREM 3.4.5 



If the vectors u = (u1, u2, u3), v = (v1, v2, v3), and w = (w1, w2, w3) have the same initial point, then they lie in the same
plane if and only if

\[
\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) =
\begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}
= 0
\]
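
Formula (8) and Theorem 3.4.5 translate directly into a volume routine and a coplanarity test. The sketch below uses our own function names, and the sample vectors are hypothetical inputs chosen only for illustration; they do not come from the text. With integer components the test is exact; with floating-point components a small tolerance would be needed instead of comparison with zero.

```python
def cross(u, v):
    """Cross product of two 3-vectors, computed componentwise."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)


def triple(u, v, w):
    """Scalar triple product u . (v x w)."""
    return sum(a * b for a, b in zip(u, cross(v, w)))


def parallelepiped_volume(u, v, w):
    """Volume from Formula (8): |u . (v x w)|."""
    return abs(triple(u, v, w))


def coplanar(u, v, w):
    """Theorem 3.4.5: same plane exactly when the triple product is 0."""
    return triple(u, v, w) == 0


# Hypothetical sample vectors (not taken from the text).
print(parallelepiped_volume((1, 0, 0), (0, 1, 0), (0, 0, 2)))  # 2
print(coplanar((1, 0, 0), (0, 1, 0), (0, 0, 2)))               # False
print(coplanar((1, 2, 3), (4, 5, 6), (7, 8, 9)))               # True
```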











Independence of Cross Product and Coordinates 

Initially, we defined a vector to be a directed line segment or arrow in 2- space or 3 -space; coordinate systems and components 
were introduced later in order to simplify computations with vectors. Thus, a vector has a "mathematical existence" regardless of 



whether a coordinate system has been introduced. Further, the components of a vector are not determined by the vector alone; they 
depend as well on the coordinate system chosen. For example, in Figure 3.4.9 we have indicated a fixed vector v in the plane and 
two different coordinate systems. In the xy-coordinate system the components of v are (1, 1), and in the x'y'-system they are
(√2, 0).




Figure 3.4.9 

This raises an important question about our definition of cross product. Since we defined the cross product u x v in terms of the 
components of u and v, and since these components depend on the coordinate system chosen, it seems possible that two fixed 
vectors u and v might have different cross products in different coordinate systems. Fortunately, this is not the case. To see that this 
is so, we need only recall that 

* u × v is perpendicular to both u and v.

* The orientation of u × v is determined by the right-hand rule.

* ||u × v|| = ||u|| ||v|| sin θ

These three properties completely determine the vector u × v: the first and second properties determine the direction, and the third
property determines the length. Since these properties of u × v depend only on the lengths and relative positions of u and v and not
on the particular right-hand coordinate system being used, the vector u × v will remain unchanged if a different right-hand
coordinate system is introduced. We say that the definition of u × v is coordinate free. This result is of importance to physicists
and engineers who often work with many coordinate systems in the same problem.



EXAMPLE 6 u × v Is Independent of the Coordinate System



Consider two perpendicular vectors u and v, each of length 1 (Figure 3.4.10a). If we introduce an xyz-coordinate system as shown
in Figure 3.4.10b, then

u = (1, 0, 0) = i and v = (0, 1, 0) = j

so that

u × v = i × j = k = (0, 0, 1)
However, if we introduce an x'y'z'-coordinate system as shown in Figure 3.4.10c, then

u = (0, 0, 1) = k and v = (1, 0, 0) = i
so that

u × v = k × i = j = (0, 1, 0)

But it is clear from Figures 3.4.10b and 3.4.10c that the vector (0, 0, 1) in the xyz-system is the same as the vector (0, 1, 0) in the
x'y'z'-system. Thus we obtain the same vector u × v whether we compute with coordinates from the xyz-system or with
coordinates from the x'y'z'-system.

Figure 3.4.10



Exercise Set 3.4 






1. Let u = (3, 2, -1), v = (0, 2, -3), and w = (2, 6, 7). Compute

(a) v × w

(b) u × (v × w)

(c) (u × v) × w

(d) (u × v) × (v × w)

(e) u × (v - 2w)

(f) (u × v) - 2w

2. Find a vector that is orthogonal to both u and v.

(a) u = (-6, 4, 2), v = (3, 1, 5)

(b) u = (-2, 1, 5), v = (3, 0, -3)

3. Find the area of the parallelogram determined by u and v.

(a) u = (1, -1, 2), v = (0, 3, 1)

(b) u = (2, 3, 0), v = (-1, 2, -2)

(c) u = (3, -1, 4), v = (6, -2, 8)

4. Find the area of the triangle having vertices P, Q, and R.

(a) P(2, 6, -1), Q(1, 1, 1), R(4, 6, 2)

(b) P(1, -1, 2), Q(0, 3, 4), R(6, 1, 8)

5. Verify parts (a), (b), and (c) of Theorem 3.4.1 for the vectors u = (4, 2, 1) and v = (-3, 2, 7).

6. Verify parts (a), (b), and (c) of Theorem 3.4.2 for u = (5, -1, 2), v = (6, 0, -2), and w = (1, 2, -1).

7. Find a vector v that is orthogonal to the vector u = (2, -3, 5).

8. Find the scalar triple product u · (v × w).

(a) u = (-1, 2, 4), v = (3, 4, -2), w = (-1, 2, 5)

(b) u = (3, -1, 6), v = (2, 4, 3), w = (5, -1, 2)

9. Suppose that u · (v × w) = 3. Find

(a) u · (w × v)

(b) (v × w) · u

(c) w · (u × v)

(d) v · (u × w)

(e) (u × w) · v

(f) v · (w × w)

10. Find the volume of the parallelepiped with sides u, v, and w.

(a) u = (2, -6, 2), v = (0, 4, -2), w = (2, 2, -4)

(b) u = (3, 1, 2), v = (4, 5, 1), w = (1, 2, 4)

11. Determine whether u, v, and w lie in the same plane when positioned so that their initial points coincide.

(a) u = (-1, -2, 1), v = (3, 0, -2), w = (5, -4, 0)

(b) u = (5, -2, 1), v = (4, -1, 1), w = (1, -1, 0)

(c) u = (4, -8, 1), v = (2, 1, -2), w = (3, -4, 12)

12. Find all unit vectors parallel to the yz-plane that are perpendicular to the vector (3, -1, 2).

13. Find all unit vectors in the plane determined by u = (3, 0, 1) and v = (1, -1, 1) that are perpendicular to the vector
w = (1, 2, 0).

14. Let a = (a1, a2, a3), b = (b1, b2, b3), c = (c1, c2, c3), and d = (d1, d2, d3). Show that
(a + d) · (b × c) = a · (b × c) + d · (b × c)

15. Simplify (u + v) × (u - v).

16. Use the cross product to find the sine of the angle between the vectors u = (2, 3, -6) and v = (2, 3, 6).

17. (a) Find the area of the triangle having vertices A(1, 0, 1), B(0, 2, 3), and C(2, 1, 0).

(b) Use the result of part (a) to find the length of the altitude from vertex C to side AB.

18. Show that if u is a vector from any point on a line to a point P not on the line, and v is a vector parallel to the line, then the
distance between P and the line is given by ||u × v|| / ||v||.

19. Use the result of Exercise 18 to find the distance between the point P and the line through the points A and B.

(a) P(-3, 1, 2), A(1, 1, 0), B(-2, 3, -4)

(b) P(4, 3, 0), A(2, 1, -3), B(0, 2, -1)

20. Prove: If θ is the angle between u and v and u · v ≠ 0, then tan θ = ||u × v|| / (u · v).

21. Consider the parallelepiped with sides u = (3, 2, 1), v = (1, 1, 2), and w = (1, 3, 3).

(a) Find the area of the face determined by u and w.

(b) Find the angle between u and the plane containing the face determined by v and w.

Note The angle between a vector and a plane is defined to be the complement of the angle θ between the vector and
that normal to the plane for which 0 ≤ θ ≤ π/2.

22. Find a vector n that is perpendicular to the plane determined by the points A(0, -2, 1), B(1, -1, -2), and C(-1, 1, 0).
[See the note in Exercise 21.]

23. Let m and n be vectors whose components in the xyz-system of Figure 3.4.10 are m = (0, 0, 1) and n = (0, 1, 0).

(a) Find the components of m and n in the x'y'z'-system of Figure 3.4.10.

(b) Compute m × n using the components in the xyz-system.

(c) Compute m × n using the components in the x'y'z'-system.

(d) Show that the vectors obtained in (b) and (c) are the same.

24. Prove the following identities.

(a) (u + kv) × v = u × v

(b) u · (v × z) = -(u × z) · v

25. Let u, v, and w be nonzero vectors in 3-space with the same initial point, but such that no two of them are collinear. Show that

(a) u × (v × w) lies in the plane determined by v and w

(b) (u × v) × w lies in the plane determined by u and v

26. Prove part (d) of Theorem 3.4.1.

Hint First prove the result in the case where w = i = (1, 0, 0), then when w = j = (0, 1, 0), and then when w = k = (0, 0, 1).
Finally, prove it for an arbitrary vector w = (w1, w2, w3) by writing w = w1 i + w2 j + w3 k.

27. Prove part (e) of Theorem 3.4.1.

Hint Apply part (a) of Theorem 3.4.2 to the result in part (d) of Theorem 3.4.1.

28. Let u = (1, 3, -1), v = (1, 1, 2), and w = (3, -1, 2). Calculate u × (v × w) using the technique of Exercise 26; then check
your result by calculating directly.

29. Prove: If a, b, c, and d lie in the same plane, then (a × b) × (c × d) = 0.

30. It is a theorem of solid geometry that the volume of a tetrahedron is 1/3 (area of base) · (height). Use this result to prove that the
volume of a tetrahedron whose sides are the vectors a, b, and c is (1/6)|a · (b × c)| (see the accompanying figure).

Figure Ex-30

31. Use the result of Exercise 30 to find the volume of the tetrahedron with vertices P, Q, R, S.

(a) P(-1, 2, 0), Q(2, 1, -3), R(1, 0, 1), S(3, -2, 3)

(b) P(0, 0, 0), Q(1, 2, -1), R(3, 4, 0), S(-1, -3, 4)

32. Prove part (b) of Theorem 3.4.2.

33. Prove parts (c) and (d) of Theorem 3.4.2.

34. Prove parts (e) and (f) of Theorem 3.4.2.

Discussion
Discovery

35. (a) Suppose that u and v are noncollinear vectors with their initial points at the origin in 3-space.
Make a sketch that illustrates how w = v × (u × v) is oriented in relation to u and v.

(b) For w as in part (a), what can you say about the values of u · w and v · w? Explain your
reasoning.

36. If u ≠ 0, is it valid to cancel u from both sides of the equation u × v = u × w and conclude that v = w?
Explain your reasoning.

37. Something is wrong with one of the following expressions. Which one is it and what is wrong?

u · (v × w), u × v × w, (u · v) × w

38. What can you say about the vectors u and v if u × v = 0?

39. Give some examples of algebraic rules that hold for multiplication of real numbers but not for the
cross product of vectors.
3.5

LINES AND PLANES IN 
3-SPACE 



In this section we shall use vectors to derive equations of lines and planes in 
3-space. We shall then use these equations to solve some basic geometric 
problems. 



Planes in 3-Space 

In analytic geometry a line in 2-space can be specified by giving its slope and one of its points. Similarly, one can specify a 
plane in 3-space by giving its inclination and specifying one of its points. A convenient method for describing the inclination of 
a plane is to specify a nonzero vector, called a normal, that is perpendicular to the plane. 

Suppose that we want to find the equation of the plane passing through the point P0(x0, y0, z0) and having the nonzero vector
n = (a, b, c) as a normal. It is evident from Figure 3.5.1 that the plane consists precisely of those points P(x, y, z) for which the
vector P0P is orthogonal to n; that is,

n · P0P = 0     (1)

Since P0P = (x - x0, y - y0, z - z0), Equation (1) can be written as

a(x - x0) + b(y - y0) + c(z - z0) = 0     (2)
We call this the point-normal form of the equation of a plane.

Figure 3.5.1 Plane with normal vector.



EXAMPLE 1 Finding the Point-Normal Equation of a Plane 



Find an equation of the plane passing through the point (3, -1, 7) and perpendicular to the vector n = (4, 2, -5).

Solution

From (2) a point-normal form is

4(x - 3) + 2(y + 1) - 5(z - 7) = 0

By multiplying out and collecting terms, we can rewrite (2) in the form

ax + by + cz + d = 0
where a, b, c, and d are constants, and a, b, and c are not all zero. For example, the equation in Example 1 can be rewritten as

4x + 2y - 5z + 25 = 0
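
Passing from the point-normal form to the general form amounts to computing the constant d = -(a x0 + b y0 + c z0). A minimal sketch follows; the function name `general_form` is our own, and it returns the coefficients (a, b, c, d) of ax + by + cz + d = 0.

```python
def general_form(p0, n):
    """Coefficients (a, b, c, d) of the plane through p0 with normal n.

    Expanding a(x - x0) + b(y - y0) + c(z - z0) = 0 gives
    d = -(a*x0 + b*y0 + c*z0).
    """
    a, b, c = n
    if (a, b, c) == (0, 0, 0):
        raise ValueError("a normal vector must be nonzero")
    x0, y0, z0 = p0
    return a, b, c, -(a * x0 + b * y0 + c * z0)


# Example 1: the point (3, -1, 7) and the normal n = (4, 2, -5).
print(general_form((3, -1, 7), (4, 2, -5)))  # (4, 2, -5, 25)
```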

As the next theorem shows, planes in 3-space are represented by equations of the form ax + by + cz + d = 0.
THEOREM 3.5.1 



If a, b, c, and d are constants and a, b, and c are not all zero, then the graph of the equation

ax + by + cz + d = 0     (3)

is a plane having the vector n = (a, b, c) as a normal.





Equation (3) is a linear equation in x, y, and z; it is called the general form of the equation of a plane.

Proof By hypothesis, the coefficients a, b, and c are not all zero. Assume, for the moment, that a ≠ 0. Then the equation
ax + by + cz + d = 0 can be rewritten in the form a(x + (d/a)) + by + cz = 0. But this is a point-normal form of the plane
passing through the point (-d/a, 0, 0) and having n = (a, b, c) as a normal.

If a = 0, then either b ≠ 0 or c ≠ 0. A straightforward modification of the above argument will handle these other cases.

Just as the solutions of a system of linear equations

ax + by = k1
cx + dy = k2
correspond to points of intersection of the lines ax + by = k1 and cx + dy = k2 in the xy-plane, so the solutions of a system

ax + by + cz = k1
dx + ey + fz = k2     (4)
gx + hy + iz = k3

correspond to the points of intersection of the three planes ax + by + cz = k1, dx + ey + fz = k2, and gx + hy + iz = k3.
In Figure 3.5.2 we have illustrated the geometric possibilities that occur when (4) has zero, one, or infinitely many solutions.




Figure 3.5.2 
(a) No solutions (3 parallel planes). (b) No solutions (2 parallel planes). (c) No solutions (3 planes with no
common intersection). (d) Infinitely many solutions (3 coincident planes). (e) Infinitely many solutions (3 planes
intersecting in a line). (f) One solution (3 planes intersecting at a point). (g) No solutions (2 coincident planes
parallel to a third plane). (h) Infinitely many solutions (2 coincident planes intersecting a third plane).



EXAMPLE 2 Equation of a Plane Through Three Points 



Find the equation of the plane passing through the points P1(1, 2, -1), P2(2, 3, 1), and P3(3, -1, 2).

Solution

Since the three points lie in the plane, their coordinates must satisfy the general equation ax + by + cz + d = 0 of the plane.
Thus

 a + 2b -  c + d = 0
2a + 3b +  c + d = 0
3a -  b + 2c + d = 0

Solving this system gives a = -(9/16)t, b = -(1/16)t, c = (5/16)t, d = t. Letting t = -16, for example, yields the desired equation

9x + y - 5z - 16 = 0

We note that any other choice of t gives a multiple of this equation, so that any value of t ≠ 0 would also give a valid equation of
the plane.

Alternative Solution

Since the points P1(1, 2, -1), P2(2, 3, 1), and P3(3, -1, 2) lie in the plane, the vectors P1P2 = (1, 1, 2) and
P1P3 = (2, -3, 3) are parallel to the plane. Therefore, the vector P1P2 × P1P3 = (9, 1, -5) is normal to the plane, since it
is perpendicular to both P1P2 and P1P3. From this and the fact that P1 lies in the plane, a point-normal form for the equation of
the plane is

9(x - 1) + (y - 2) - 5(z + 1) = 0 or 9x + y - 5z - 16 = 0
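
The alternative solution is the easier of the two to automate: a normal is the cross product of two edge vectors, and the constant term follows from any one of the points. The sketch below uses our own function name `plane_through` and returns (a, b, c, d) for ax + by + cz + d = 0; it assumes the three points are not collinear.

```python
def cross(u, v):
    """Cross product of two 3-vectors, computed componentwise."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)


def plane_through(p1, p2, p3):
    """General-form coefficients of the plane through three points."""
    e1 = tuple(b - a for a, b in zip(p1, p2))  # vector P1P2
    e2 = tuple(b - a for a, b in zip(p1, p3))  # vector P1P3
    n = cross(e1, e2)
    if n == (0, 0, 0):
        raise ValueError("the points are collinear; no unique plane")
    d = -sum(c * x for c, x in zip(n, p1))
    return n + (d,)


print(plane_through((1, 2, -1), (2, 3, 1), (3, -1, 2)))  # (9, 1, -5, -16)
```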



Vector Form of Equation of a Plane 

Vector notation provides a useful alternative way of writing the point-normal form of the equation of a plane: Referring to
Figure 3.5.3, let r = (x, y, z) be the vector from the origin to the point P(x, y, z), let r0 = (x0, y0, z0) be the vector from the
origin to the point P0(x0, y0, z0), and let n = (a, b, c) be a vector normal to the plane. Then P0P = r - r0, so Formula (1) can be
rewritten as

n · (r - r0) = 0     (5)
This is called the vector form of the equation of a plane.




EXAMPLE 3 Vector Equation of a Plane Using 5 



The equation 

(-1, 2, 5) · (x - 6, y - 3, z + 4) = 0
is the vector equation of the plane that passes through the point (6, 3, -4) and is perpendicular to the vector n = (-1, 2, 5).

Lines in 3-Space 

We shall now show how to obtain equations for lines in 3-space. Suppose that l is the line in 3-space through the point
P0(x0, y0, z0) and parallel to the nonzero vector v = (a, b, c). It is clear (Figure 3.5.4) that l consists precisely of those points
P(x, y, z) for which the vector P0P is parallel to v, that is, for which there is a scalar t such that

P0P = tv     (6)

In terms of components, (6) can be written as

(x - x0, y - y0, z - z0) = (ta, tb, tc)

from which it follows that x - x0 = ta, y - y0 = tb, and z - z0 = tc, so

x = x0 + ta, y = y0 + tb, z = z0 + tc
As the parameter t varies from -∞ to +∞, the point P(x, y, z) traces out the line l. The equations

x = x0 + ta, y = y0 + tb, z = z0 + tc   (-∞ < t < +∞)     (7)

are called parametric equations for l.

Figure 3.5.4 P0P is parallel to v.



EXAMPLE 4 Parametric Equations of a Line 



The line through the point (1, 2, -3) and parallel to the vector v = (4, 5, -7) has parametric equations

x = 1 + 4t, y = 2 + 5t, z = -3 - 7t   (-∞ < t < +∞)



EXAMPLE 5 Intersection of a Line and the xy-Plane 



(a) Find parametric equations for the line l passing through the points P1(2, 4, -1) and P2(5, 0, 7).

(b) Where does the line intersect the xy-plane?

Solution (a)

Since the vector P1P2 = (3, -4, 8) is parallel to l and P1(2, 4, -1) lies on l, the line l is given by

x = 2 + 3t, y = 4 - 4t, z = -1 + 8t   (-∞ < t < +∞)

Solution (b)

The line intersects the xy-plane at the point where z = -1 + 8t = 0, that is, where t = 1/8. Substituting this value of t in the
parametric equations for l yields, as the point of intersection,

(x, y, z) = (19/8, 7/2, 0)
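
Example 5 can be reproduced with exact rational arithmetic. The following sketch uses Python's standard `fractions` module; the helper names `line_through` and `point_at` are our own, and the code assumes the line is not parallel to the xy-plane (that is, the z-component of the direction vector is nonzero).

```python
from fractions import Fraction


def line_through(p1, p2):
    """Return (point, direction) for the line through p1 and p2."""
    direction = tuple(b - a for a, b in zip(p1, p2))
    return p1, direction


def point_at(p0, v, t):
    """Point of the line x = x0 + ta, y = y0 + tb, z = z0 + tc at parameter t."""
    return tuple(a + t * c for a, c in zip(p0, v))


p0, v = line_through((2, 4, -1), (5, 0, 7))
print(v)  # (3, -4, 8)

# Solve z0 + t*c = 0 for t; this requires c != 0.
t = Fraction(-p0[2], v[2])
hit = point_at(p0, v, t)
print(t)    # 1/8
print(hit)  # the point (19/8, 7/2, 0), printed as Fraction objects
```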



EXAMPLE 6 Line of Intersection of Two Planes 



Find parametric equations for the line of intersection of the planes

3x + 2y - 4z - 6 = 0 and x - 3y - 2z - 4 = 0

Solution

The line of intersection consists of all points (x, y, z) that satisfy the two equations in the system

3x + 2y - 4z = 6
 x - 3y - 2z = 4

Solving this system by Gaussian elimination gives x = 26/11 + (16/11)t, y = -6/11 - (2/11)t, z = t. Therefore, the line of intersection can be
represented by the parametric equations

x = 26/11 + (16/11)t, y = -6/11 - (2/11)t, z = t   (-∞ < t < +∞)
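
As a cross-check on Example 6, the same line can be found without Gaussian elimination: its direction is perpendicular to both normals, so it is parallel to n1 × n2, and one point on it is obtained by setting z = 0 and solving the remaining 2 × 2 system by Cramer's rule. This is a different technique from the one used in the solution above, and the sketch assumes the z-component of n1 × n2 is nonzero; the function name `intersection_line` is our own.

```python
from fractions import Fraction


def cross(u, v):
    """Cross product of two 3-vectors, computed componentwise."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)


def intersection_line(n1, k1, n2, k2):
    """Line common to the planes n1 . (x, y, z) = k1 and n2 . (x, y, z) = k2.

    Returns (point, direction) with the direction scaled so that z = t.
    Assumes the z-component of n1 x n2 is nonzero.
    """
    d = cross(n1, n2)
    det = d[2]  # equals a1*b2 - a2*b1, the determinant of the z = 0 system
    if det == 0:
        raise ValueError("cannot parametrize by z for these planes")
    (a1, b1, _), (a2, b2, _) = n1, n2
    x = Fraction(k1 * b2 - k2 * b1, det)  # Cramer's rule with z = 0
    y = Fraction(a1 * k2 - a2 * k1, det)
    direction = tuple(Fraction(c, det) for c in d)
    return (x, y, Fraction(0)), direction


point, direction = intersection_line((3, 2, -4), 6, (1, -3, -2), 4)
print(point)      # (26/11, -6/11, 0) as Fraction objects
print(direction)  # (16/11, -2/11, 1) as Fraction objects
```

The point and direction agree with the parametric equations obtained in the example.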



Vector Form of Equation of a Line 

Vector notation provides a useful alternative way of writing the parametric equations of a line: Referring to Figure 3.5.5, let
r = (x, y, z) be the vector from the origin to the point P(x, y, z), let r0 = (x0, y0, z0) be the vector from the origin to the point
P0(x0, y0, z0), and let v = (a, b, c) be a vector parallel to the line. Then P0P = r - r0, so Formula (6) can be rewritten as

r - r0 = tv
Taking into account the range of t-values, this can be rewritten as

r = r0 + tv   (-∞ < t < +∞)     (8)
This is called the vector form of the equation of a line in 3-space.

Figure 3.5.5 Vector interpretation of a line in 3-space.



EXAMPLE 7 A Line Parallel to a Given Vector 



The equation 

(x, y, z) = (-2, 0, 3) + t(4, -7, 1)   (-∞ < t < +∞)
is the vector equation of the line through the point (-2, 0, 3) that is parallel to the vector v = (4, -7, 1).



Problems Involving Distance 

We conclude this section by discussing two basic "distance problems" in 3-space: 



Problems

(a) Find the distance between a point and a plane.

(b) Find the distance between two parallel planes.

The two problems are related. If we can find the distance between a point and a plane, then we can find the distance between
parallel planes by computing the distance between either one of the planes and an arbitrary point P0 in the other (Figure 3.5.6).

Figure 3.5.6 The distance between the parallel planes V and W is equal to the distance between P0 and W.



THEOREM 3.5.2 



Distance Between a Point and a Plane 

The distance D between a point P0(x0, y0, z0) and the plane ax + by + cz + d = 0 is

D = |a x0 + b y0 + c z0 + d| / √(a² + b² + c²)     (9)



Proof Let Q(x1, y1, z1) be any point in the plane. Position the normal n = (a, b, c) so that its initial point is at Q. As illustrated
in Figure 3.5.7, the distance D is equal to the length of the orthogonal projection of QP0 on n. Thus, from (10) of Section 3.3,

D = ||proj_n QP0|| = |QP0 · n| / ||n||

But

QP0 = (x0 - x1, y0 - y1, z0 - z1)

QP0 · n = a(x0 - x1) + b(y0 - y1) + c(z0 - z1)

||n|| = √(a² + b² + c²)

Thus

D = |a(x0 - x1) + b(y0 - y1) + c(z0 - z1)| / √(a² + b² + c²)     (10)

Since the point Q(x1, y1, z1) lies in the plane, its coordinates satisfy the equation of the plane; thus

a x1 + b y1 + c z1 + d = 0

or

d = -a x1 - b y1 - c z1

Substituting this expression in (10) yields (9).

Figure 3.5.7 Distance from P0 to plane.



Remark Note the similarity between (9) and the formula for the distance between a point and a line in 2-space [13 of Section 
3.3]. 



EXAMPLE 8 Distance Between a Point and a Plane 



Find the distance D between the point (1, -4, -3) and the plane 2x - 3y + 6z = -1.

Solution

To apply (9), we first rewrite the equation of the plane in the form

2x - 3y + 6z + 1 = 0
Then

D = |2(1) + (-3)(-4) + 6(-3) + 1| / √(2² + (-3)² + 6²) = |-3| / 7 = 3/7



Given two planes, either they intersect, in which case we can ask for their line of intersection, as in Example 6, or they are 
parallel, in which case we can ask for the distance between them. The following example illustrates the latter problem. 



EXAMPLE 9 Distance Between Parallel Planes 



The planes

x + 2y - 2z = 3 and 2x + 4y - 4z = 7

are parallel since their normals, (1, 2, -2) and (2, 4, -4), are parallel vectors. Find the distance between these planes.

Solution

To find the distance D between the planes, we may select an arbitrary point in one of the planes and compute its distance to the
other plane. By setting y = z = 0 in the equation x + 2y - 2z = 3, we obtain the point P0(3, 0, 0) in this plane. From (9), the
distance between P0 and the plane 2x + 4y - 4z = 7 is

D = |2(3) + 4(0) + (-4)(0) - 7| / √(2² + 4² + (-4)²) = 1/6
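
Formula (9) is a one-line computation. The sketch below, with a function name of our own choosing, applies it to Examples 8 and 9; the plane is passed as the coefficients of ax + by + cz + d = 0, so an equation such as 2x + 4y - 4z = 7 must first be rewritten with d = -7. Results are floating-point values.

```python
import math


def point_plane_distance(p, a, b, c, d):
    """Distance from the point p to the plane ax + by + cz + d = 0, Formula (9)."""
    x0, y0, z0 = p
    return abs(a * x0 + b * y0 + c * z0 + d) / math.sqrt(a * a + b * b + c * c)


# Example 8: the point (1, -4, -3) and the plane 2x - 3y + 6z + 1 = 0.
print(point_plane_distance((1, -4, -3), 2, -3, 6, 1))  # about 0.428571, i.e. 3/7

# Example 9: the point (3, 0, 0) on x + 2y - 2z = 3 and the plane 2x + 4y - 4z - 7 = 0.
print(point_plane_distance((3, 0, 0), 2, 4, -4, -7))   # about 0.166667, i.e. 1/6
```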



Exercise Set 3.5 






1. Find a point-normal form of the equation of the plane passing through P and having n as a normal.

(a) P(-1, 3, -2); n = (-2, 1, -1)

(b) P(1, 1, 4); n = (1, 9, 8)

(c) P(2, 0, 0); n = (0, 0, 2)

(d) P(0, 0, 0); n = (1, 2, 3)

2. Write the equations of the planes in Exercise 1 in general form.

3. Find a point-normal form of the equations of the following planes.

(a) -3x + 7y + 2z = 10

(b) x - 4z = 0

4. Find an equation for the plane passing through the given points.

(a) P(-4, -1, -1), Q(-2, 0, 1), R(-1, -2, -3)

(b) P(5, 4, 3), Q(4, 3, 1), R(1, 5, 4)

5. Determine whether the planes are parallel.

(a) 4x - y + 2z = 5 and 7x - 3y + 4z = 8

(b) x - 4y - 3z - 2 = 0 and 3x - 12y - 9z - 7 = 0

(c) 2y = 2x-4z I 5 and x = \z + \y 

6. Determine whether the line and plane are parallel.

(a) x = -5 - 4t, y = 1 - t, z = 3 + 2t; x + 2y + 3z - 9 = 0

(b) x = 3t, y = 1 + 2t, z = 2 - t; 4x - y + 2z = 1

7. Determine whether the planes are perpendicular.

(a) 3x - y + z - 4 = 0, x + 2z = -1

(b) x - 2y + 3z = 4, -2x + 5y + 4z = -1

8. Determine whether the line and plane are perpendicular.

(a) x = -2 - 4t, y = 3 - 2t, z = 1 + 2t; 2x + y - z = 5

(b) T = 9 -i- f , y = 1 - 1, 7 = ^ -i 3f ; 6x | 6y-7 = 

9. Find parametric equations for the line passing through P and parallel to n.

(a) P(3, -1, 2); n = (2, 1, 3)

(b) P(-2, 3, -3); n = (6, -6, -2)

(c) P(2, 2, 6); n = (0, 1, 0)

(d) P(0, 0, 0); n = (1, -2, 3)

10. Find parametric equations for the line passing through the given points.

(a) (5, -2, 4), (7, 2, -4)

(b) (0, 0, 0), (2, -1, -3)

11. Find parametric equations for the line of intersection of the given planes.

(a) lx - 2y f 3z = - 2 and _ 3 X + y + 2z + 5 = 

(b) 2x + 3y - 5z = 0 and y = 0

12. Find the vector form of the equation of the plane that passes through P0 and has normal n.

(a) P0(-1, 2, 4); n = (-2, 4, 1)

(b) P0(2, 0, -5); n = (-1, 4, 3)

(c) P0(5, -2, 1); n = (-1, 0, 0)

(d) P0(0, 0, 0); n = (a, b, c)

13. Determine whether the planes are parallel.

(a) (-1, 2, 4) · (x - 5, y + 3, z - 7) = 0; (2, -4, -8) · (x + 3, y + 5, z - 9) = 0

(b) (3,0, -1) -(xH l,y-2,z-3) = 0; (-1,0,3) - (x I \,y-z,z-3) = 



Determine whether the planes are perpendicular. 
14. 

(a) (-2,1,4) -{x-\,y,z+3)=0;{\, -2,1) - (x I 3,^-5,z)=0 



(b) (3,0, -2) -(* I 4,y-l,z I 1) = 0; (1,1,1) -(x,y,z) = 

15. Find the vector form of the equation of the line through $P_0$ and parallel to v.

(a) $P_0$(-1, 2, 3); v = (7, -1, 5)

(b) $P_0$(2, 0, -1); v = (1, 1, 1)

(c) $P_0$(2, -4, 1); v = (0, 0, -2)

(d) $P_0$(0, 0, 0); v = (a, b, c)



16. Show that the line

x = 0, y = t, z = t (-∞ < t < +∞)

(a) lies in the plane 6x + 4y - 4z = 0

(b) is parallel to and below the plane 5x - 3y + 3z = 1

(c) is parallel to and above the plane 6x + 2y - 2z = 3

17. Find an equation for the plane through (-2, 1, 7) that is perpendicular to the line x - 4 = 2t, y + 2 = 3t, z = -5t.



18. Find an equation of

(a) the xy-plane

(b) the xz-plane

(c) the yz-plane

19. Find an equation of the plane that contains the point $(x_0, y_0, z_0)$ and is

(a) parallel to the xy-plane

(b) parallel to the yz-plane

(c) parallel to the xz-plane

20. Find an equation for the plane that passes through the origin and is parallel to the plane 7x + 4y - 2z + 3 = 0.

21. Find an equation for the plane that passes through the point (3, -6, 7) and is parallel to the plane 5x - 2y + z - 5 = 0.

22. Find the point of intersection of the line

x - 9 = -5t, y + 1 = -t, z - 3 = t (-∞ < t < +∞)

and the plane 2x - 3y + 4z + 7 = 0.

23. Find an equation for the plane that contains the line x = -1 + 3t, y = 5 + 2t, z = 2 - t and is perpendicular to the plane 2x - 4y + 2z = 9.

24. Find an equation for the plane that passes through (2, 4, -1) and contains the line of intersection of the planes x - y - 4z = 2 and -2x + y + 2z = 3.

25. Show that the points (-1, -2, -3), (-2, 0, 1), (-4, -1, -1), and (2, 0, 1) lie in the same plane.

26. Find parametric equations for the line through (-2, 5, 0) that is parallel to the planes 2x + y - 4z = 0 and -x + 2y + 3z + 1 = 0.

27. Find an equation for the plane through (-2, 1, 5) that is perpendicular to the planes 4x - 2y + 2z = -1 and 3x + 3y - 6z = 5.

28. Find an equation for the plane through (2, -1, 4) that is perpendicular to the line of intersection of the planes 4x + 2y + 2z = -1 and 3x + 6y + 3z = 7.

29. Find an equation for the plane that is perpendicular to the plane 8x - 2y + 6z = 1 and passes through the points $P_1$(-1, 2, 5) and $P_2$(2, 1, 4).

30. Show that the lines

x = 3 - 2t, y = 4 + t, z = 1 - t (-∞ < t < +∞)

and

x = 5 + 2t, y = 1 - t, z = 7 + t (-∞ < t < +∞)

are parallel, and find an equation for the plane they determine.

31. Find an equation for the plane that contains the point (1, -1, 2) and the line x = t, y = t + 1, z = -3 + 2t.

32. Find an equation for the plane that contains the line x = 1 + t, y = 3t, z = 2t and is parallel to the line of intersection of the planes -x + 2y + z = 0 and x + z + 1 = 0.

33. Find an equation for the plane, each of whose points is equidistant from (-1, -4, -2) and (0, -2, 2).



34. Show that the line

x - 5 = -t, y + 3 = 2t, z + 1 = -5t (-∞ < t < +∞)

is parallel to the plane -3x + y + z - 9 = 0.

35. Show that the lines

x - 3 = 4t, y - 4 = t, z - 1 = 0 (-∞ < t < +∞)

and

x + 1 = 12t, y - 7 = 6t, z - 5 = 3t (-∞ < t < +∞)

intersect, and find the point of intersection.

36. Find an equation for the plane containing the lines in Exercise 35.

37. Find parametric equations for the line of intersection of the planes

(a) -3x + 2y + z = -5 and 7x + 3y - 2z = -2

(b) 5x - 7y + 2z = 0 and y = 0
38. Show that the plane whose intercepts with the coordinate axes are x = a, y = b, and z = c has equation

$\frac{x}{a} + \frac{y}{b} + \frac{z}{c} = 1$

provided that a, b, and c are nonzero.



39. Find the distance between the point and the plane.

(a) (3, 1, -2); x + 2y - 2z = 4

(b) (-1, 2, 1); 2x + 3y - 4z = 1

(c) (0, 3, -2); x - y - z = 3

40. Find the distance between the given parallel planes.

(a) 3x - 4y + z = 1 and 6x - 8y + 2z = 3

(b) -4x + y - 3z = 0 and 8x - 2y + 6z = 0

(c) 2x - y + z = 1 and 2x - y + z = -1



41. Find the distance between the line x = 3t - 1, y = 2 - t, z = t and each of the following points.

(a) (0,0,0) 

(b) (2, 0, - 5) 

(c) (2,1,1) 

42. Show that if a, b, and c are nonzero, then the line

$x = x_0 + at, \quad y = y_0 + bt, \quad z = z_0 + ct \quad (-\infty < t < +\infty)$

consists of all points (x, y, z) that satisfy

$\frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c}$

These are called symmetric equations for the line.

43. Find symmetric equations for the lines in parts (a) and (b) of Exercise 9.

Note See Exercise 42 for terminology.

44. In each part, find equations for two planes whose intersection is the given line.

(a) x = 7 - 4t, y = -5 - 2t, z = 5 + t (-∞ < t < +∞)

(b) x = 4t, y = 2t, z = 7t (-∞ < t < +∞)

Hint Each equality in the symmetric equations of a line represents a plane containing the line. See Exercise 42 for terminology.

45. Two intersecting planes in 3-space determine two angles of intersection: an acute angle θ (0 ≤ θ ≤ 90°) and its supplement 180° - θ (see the accompanying figure). If $\mathbf{n}_1$ and $\mathbf{n}_2$ are nonzero normals to the planes, then the angle between $\mathbf{n}_1$ and $\mathbf{n}_2$ is θ or 180° - θ, depending on the directions of the normals (see the accompanying figure). In each part, find the acute angle of intersection of the planes to the nearest degree.

(a) x = 0 and 2x - y + z - 4 = 0

(b) x + 2y - 2z = 5 and 6x - 3y + 2z = 3

Figure Ex-45

Note A calculator is needed.

46. Find the acute angle between the plane x - y - 3z = 5 and the line x = 2 - t, y = 2t, z = 3t - 1 to the nearest degree.

Hint See Exercise 45.



Discussion and Discovery

47. What do the lines $\mathbf{r} = \mathbf{r}_0 + t\mathbf{v}$ and $\mathbf{r} = \mathbf{r}_0 - t\mathbf{v}$ have in common? Explain.



48. What is the relationship between the line $x = x_0 + at$, $y = y_0 + bt$, $z = z_0 + ct$ and the plane ax + by + cz = 0? Explain your reasoning.

49. Let $\mathbf{r}_1$ and $\mathbf{r}_2$ be vectors from the origin to the points $P_1(x_1, y_1, z_1)$ and $P_2(x_2, y_2, z_2)$, respectively. What does the equation

$\mathbf{r} = (1 - t)\mathbf{r}_1 + t\mathbf{r}_2 \quad (0 \le t \le 1)$

represent geometrically? Explain your reasoning.

50. Write parametric equations for two perpendicular lines through the point $(x_0, y_0, z_0)$.



51. How can you tell whether the line $\mathbf{x} = \mathbf{x}_0 + t\mathbf{v}$ in 3-space is parallel to the plane $\mathbf{x} = \mathbf{x}_0 + t_1\mathbf{v}_1 + t_2\mathbf{v}_2$?

52. Indicate whether the statement is true (T) or false (F). Justify your answer.

(a) If a, b, and c are not all zero, then the line x = at, y = bt, z = ct is perpendicular to the plane ax + by + cz = 0.

(b) Two nonparallel lines in 3-space must intersect in at least one point.

(c) If u, v, and w are vectors in 3-space such that u + v + w = 0, then the three vectors lie in some plane.

(d) The equation x = tv represents a line for every vector v in 2-space.



Copyright © 2005 John Wiley & Sons, Inc. All rights reserved. 



Chapter 3



Technology Exercises



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 

Section 3.1 

T1. (Vectors) Read your documentation on how to enter vectors and how to add, subtract, and multiply them by scalars. Then
perform the computations in Example 1.



T2. (Drawing Vectors) If you are using a technology utility that can draw line segments in two or three-dimensional space, try 
drawing some line segments with initial and terminal points of your choice. You may also want to see if your utility allows 
you to create arrowheads, in which case you can make your line segments look like geometric vectors. 

Section 3.3 

T1. (Dot Product and Norm) Some technology utilities provide commands for calculating dot products and norms, whereas
others provide only a command for the dot product. In the latter case, norms can be computed from the formula $\|\mathbf{v}\| = \sqrt{\mathbf{v} \cdot \mathbf{v}}$.
Read your documentation on how to find dot products (and norms, if available), and then perform the computations in
Example 2.



T2. (Projections) See if you can program your utility to calculate $\mathrm{proj}_{\mathbf{a}}\mathbf{u}$ when the user enters the vectors a and u. Check your
work by having your program perform the computations in Example 6. 

Section 3.4 

T1. (Cross Product) Read your documentation on how to find cross products, and then perform the computation in Example 1.

T2. (Cross Product Formula) If you are working with a CAS, use it to confirm Formula 1. 

T3. (Cross Product Properties) If you are working with a CAS, use it to prove the results in Theorem 3.4.1. 



T4. (Area of a Triangle) See if you can program your technology utility to find the area of the triangle in 3-space determined by 
three points when the user enters their coordinates. Check your work by calculating the area of the triangle in Example 4. 



T5. (Triple Scalar Product Formula) If you are working with a CAS, use it to prove Formula 7 by showing that the difference 
between the two sides is zero. 



T6. (Volume of a Parallelepiped) See if you can program your technology utility to find the volume of the parallelepiped in
3-space determined by vectors u, v, and w when the user enters the vectors. Check your work by solving Exercise 10 in
Exercise Set 3.4.
CHAPTER 4



Euclidean Vector Spaces 



INTRODUCTION: The idea of using pairs of numbers to locate points in the plane and triples of numbers to locate points 
in 3-space was first clearly spelled out in the mid-seventeenth century. By the latter part of the eighteenth century, 
mathematicians and physicists began to realize that there was no need to stop with triples. It was recognized that quadruples 
of numbers $(a_1, a_2, a_3, a_4)$ could be regarded as points in "four-dimensional" space, quintuples $(a_1, a_2, a_3, a_4, a_5)$ as points
in "five-dimensional" space, and so on, an n-tuple of numbers being a point in " n-dimensional" space. Our goal in this chapter 
is to study the properties of operations on vectors in this kind of space. 
4.1

EUCLIDEAN n-SPACE



Although our geometric visualization does not extend beyond 3-space, it is 
nevertheless possible to extend many familiar ideas beyond 3-space by 
working with analytic or numerical properties of points and vectors rather than 
the geometric properties. In this section we shall make these ideas more 
precise. 



Vectors in n-Space

We begin with a definition. 



DEFINITION 



If n is a positive integer, then an ordered n-tuple is a sequence of n real numbers $(a_1, a_2, \ldots, a_n)$. The set of all ordered
n-tuples is called n-space and is denoted by $R^n$.



When n = 2 or 3, it is customary to use the terms ordered pair and ordered triple, respectively, rather than ordered 2-tuple and
ordered 3-tuple. When n = 1, each ordered n-tuple consists of one real number, so $R^1$ may be viewed as the set of real numbers.
It is usual to write R rather than $R^1$ for this set.



It might have occurred to you in the study of 3-space that the symbol $(a_1, a_2, a_3)$ has two different geometric interpretations: it
can be interpreted as a point, in which case $a_1$, $a_2$, and $a_3$ are the coordinates (Figure 4.1.1a), or it can be interpreted as a vector,
in which case $a_1$, $a_2$, and $a_3$ are the components (Figure 4.1.1b). It follows, therefore, that an ordered n-tuple $(a_1, a_2, \ldots, a_n)$ can
be viewed either as a "generalized point" or as a "generalized vector"; the distinction is mathematically unimportant. Thus we
can describe the 5-tuple (-2, 4, 0, 1, 6) either as a point in $R^5$ or as a vector in $R^5$.



Figure 4.1.1 The ordered triple $(a_1, a_2, a_3)$ can be interpreted geometrically (a) as a point or (b) as a vector.



DEFINITION 




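Two vectors $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ in $R^n$ are called equal if

$u_1 = v_1, \quad u_2 = v_2, \quad \ldots, \quad u_n = v_n$

The sum $\mathbf{u} + \mathbf{v}$ is defined by

$\mathbf{u} + \mathbf{v} = (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n)$

and if k is any scalar, the scalar multiple $k\mathbf{u}$ is defined by

$k\mathbf{u} = (ku_1, ku_2, \ldots, ku_n)$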



The operations of addition and scalar multiplication in this definition are called the standard operations on $R^n$.

The zero vector in $R^n$ is denoted by $\mathbf{0}$ and is defined to be the vector

$\mathbf{0} = (0, 0, \ldots, 0)$

If $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ is any vector in $R^n$, then the negative (or additive inverse) of u is denoted by $-\mathbf{u}$ and is defined by

$-\mathbf{u} = (-u_1, -u_2, \ldots, -u_n)$

The difference of vectors in $R^n$ is defined by

$\mathbf{v} - \mathbf{u} = \mathbf{v} + (-\mathbf{u})$

or, in terms of components,

$\mathbf{v} - \mathbf{u} = (v_1 - u_1, v_2 - u_2, \ldots, v_n - u_n)$
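
The standard operations translate directly into code. The following is a minimal Python sketch, assuming vectors in $R^n$ are stored as tuples of numbers; the function names (add, scale, negative, subtract) are illustrative choices and not notation from the text.

```python
def add(u, v):
    """Return the sum u + v, computed component by component."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return tuple(a + b for a, b in zip(u, v))


def scale(k, u):
    """Return the scalar multiple ku."""
    return tuple(k * a for a in u)


def negative(u):
    """Return the additive inverse -u."""
    return scale(-1, u)


def subtract(v, u):
    """Return the difference v - u, defined as v + (-u)."""
    return add(v, negative(u))


u = (-3, 2, 1, 0)
v = (4, 7, -3, 2)
print(add(u, v))       # (1, 9, -2, 2)
print(scale(2, u))     # (-6, 4, 2, 0)
print(subtract(v, u))  # (7, 5, -4, 2)
```

Defining subtract through add and negative mirrors the definition v - u = v + (-u) rather than subtracting components directly; both give the same result.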



Some Examples of Vectors in Higher-Dimensional Spaces 

Experimental Data A scientist performs an experiment and makes n numerical measurements each time the 
experiment is performed. The result of each experiment can be regarded as a vector $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ in $R^n$ in which
$y_1, y_2, \ldots, y_n$ are the measured values.

Storage and Warehousing A national trucking company has 15 depots for storing and servicing its trucks. At each
point in time the distribution of trucks in the service depots can be described by a 15-tuple $\mathbf{x} = (x_1, x_2, \ldots, x_{15})$ in which
$x_1$ is the number of trucks in the first depot, $x_2$ is the number in the second depot, and so forth.

Electrical Circuits A certain kind of processing chip is designed to receive four input voltages and produce three
output voltages in response. The input voltages can be regarded as vectors in $R^4$ and the output voltages as vectors in $R^3$.
Thus, the chip can be viewed as a device that transforms each input vector $\mathbf{v} = (v_1, v_2, v_3, v_4)$ in $R^4$ into some output
vector $\mathbf{w} = (w_1, w_2, w_3)$ in $R^3$.



Graphical Images One way in which color images are created on computer screens is by assigning each pixel (an 
addressable point on the screen) three numbers that describe the hue, saturation, and brightness of the pixel. Thus, a 
complete color image can be viewed as a set of 5-tuples of the form $\mathbf{v} = (x, y, h, s, b)$ in which x and y are the screen
coordinates of a pixel and h, s, and b are its hue, saturation, and brightness. 



Economics Our approach to economic analysis is to divide an economy into sectors (manufacturing, services, utilities, 
and so forth) and to measure the output of each sector by a dollar value. Thus, in an economy with 10 sectors the 
economic output of the entire economy can be represented by a 10-tuple $\mathbf{s} = (s_1, s_2, \ldots, s_{10})$ in which the numbers
$s_1, s_2, \ldots, s_{10}$ are the outputs of the individual sectors.

Mechanical Systems Suppose that six particles move along the same coordinate line so that at time t their coordinates
are $x_1, x_2, \ldots, x_6$ and their velocities are $v_1, v_2, \ldots, v_6$, respectively. This information can be represented by the vector

$\mathbf{v} = (x_1, x_2, x_3, x_4, x_5, x_6, v_1, v_2, v_3, v_4, v_5, v_6, t)$

in $R^{13}$. This vector is called the state of the particle system at time t.

Physics In string theory the smallest, indivisible components of the Universe are not particles but loops that behave 
like vibrating strings. Whereas Einstein's space-time universe was four-dimensional, strings reside in an 11-dimensional
world. 



Properties of Vector Operations in n-Space

The most important arithmetic properties of addition and scalar multiplication of vectors in $R^n$ are listed in the following
theorem. The proofs are all easy and are left as exercises. 

THEOREM 4.1.1 



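Properties of Vectors in $R^n$

If $\mathbf{u} = (u_1, u_2, \ldots, u_n)$, $\mathbf{v} = (v_1, v_2, \ldots, v_n)$, and $\mathbf{w} = (w_1, w_2, \ldots, w_n)$ are vectors in $R^n$ and k and m are scalars, then:

(a) $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$

(b) $\mathbf{u} + (\mathbf{v} + \mathbf{w}) = (\mathbf{u} + \mathbf{v}) + \mathbf{w}$

(c) $\mathbf{u} + \mathbf{0} = \mathbf{0} + \mathbf{u} = \mathbf{u}$

(d) $\mathbf{u} + (-\mathbf{u}) = \mathbf{0}$; that is, $\mathbf{u} - \mathbf{u} = \mathbf{0}$

(e) $k(m\mathbf{u}) = (km)\mathbf{u}$

(f) $k(\mathbf{u} + \mathbf{v}) = k\mathbf{u} + k\mathbf{v}$

(g) $(k + m)\mathbf{u} = k\mathbf{u} + m\mathbf{u}$

(h) $1\mathbf{u} = \mathbf{u}$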



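This theorem enables us to manipulate vectors in $R^n$ without expressing the vectors in terms of components. For example, to
solve the vector equation $\mathbf{x} + \mathbf{u} = \mathbf{v}$ for x, we can add $-\mathbf{u}$ to both sides and proceed as follows:

$(\mathbf{x} + \mathbf{u}) + (-\mathbf{u}) = \mathbf{v} + (-\mathbf{u})$

$\mathbf{x} + (\mathbf{u} - \mathbf{u}) = \mathbf{v} - \mathbf{u}$

$\mathbf{x} + \mathbf{0} = \mathbf{v} - \mathbf{u}$

$\mathbf{x} = \mathbf{v} - \mathbf{u}$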

The reader will find it instructive to name the parts of Theorem 4.1.1 that justify the last three steps in this computation. 

Euclidean n-Space

To extend the notions of distance, norm, and angle to $R^n$, we begin with the following generalization of the dot product on $R^2$
and $R^3$ [Formulas 3 and 4 of Section 3.3].



DEFINITION 




If $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ are any vectors in $R^n$, then the Euclidean inner product $\mathbf{u} \cdot \mathbf{v}$ is defined by

$\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n$



Observe that when n = 2 or 3, the Euclidean inner product is the ordinary dot product. 



EXAMPLE 1 Inner Product of Vectors in $R^4$



The Euclidean inner product of the vectors

u = (-1, 3, 5, 7) and v = (5, -4, 7, 0)
in $R^4$ is

$\mathbf{u} \cdot \mathbf{v} = (-1)(5) + (3)(-4) + (5)(7) + (7)(0) = 18$
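
The definition can be checked numerically. The sketch below assumes vectors are stored as Python tuples; the helper name dot is an illustrative choice.

```python
def dot(u, v):
    """Euclidean inner product: the sum of products of corresponding components."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(u, v))


u = (-1, 3, 5, 7)
v = (5, -4, 7, 0)
print(dot(u, v))  # 18, as in Example 1
```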

Since so many of the familiar ideas from 2-space and 3-space carry over to n-space, it is common to refer to $R^n$, with the
operations of addition, scalar multiplication, and the Euclidean inner product, as Euclidean n-space.

The four main arithmetic properties of the Euclidean inner product are listed in the next theorem. 
THEOREM 4.1.2 



Properties of Euclidean Inner Product 

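If u, v, and w are vectors in $R^n$ and k is any scalar, then:

(a) $\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$

(b) $(\mathbf{u} + \mathbf{v}) \cdot \mathbf{w} = \mathbf{u} \cdot \mathbf{w} + \mathbf{v} \cdot \mathbf{w}$

(c) $(k\mathbf{u}) \cdot \mathbf{v} = k(\mathbf{u} \cdot \mathbf{v})$

(d) $\mathbf{v} \cdot \mathbf{v} \ge 0$. Further, $\mathbf{v} \cdot \mathbf{v} = 0$ if and only if $\mathbf{v} = \mathbf{0}$.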



We shall prove parts (b) and (d) and leave proofs of the rest as exercises. 



Application of Dot Products to ISBNs 

Most books published in the last 25 years have been assigned a unique 10-digit number called an International Standard 
Book Number or ISBN. The first nine digits of this number are split into three groups — the first group representing the 
country or group of countries in which the book originates, the second identifying the publisher, and the third assigned to the 
book title itself. The tenth and final digit, called a check digit, is computed from the first nine digits and is used to ensure that 
an electronic transmission of the ISBN, say over the Internet, occurs without error. 

To explain how this is done, regard the first nine digits of the ISBN as a vector b in $R^9$, and let a be the vector

a = (1, 2, 3, 4, 5, 6, 7, 8, 9)
Then the check digit c is computed using the following procedure:

1. Form the dot product $\mathbf{a} \cdot \mathbf{b}$.

2. Divide $\mathbf{a} \cdot \mathbf{b}$ by 11, thereby producing a remainder c that is an integer between 0 and 10, inclusive. The check digit is
taken to be c, with the proviso that c = 10 is written as X to avoid double digits.

For example, the ISBN of the brief edition of Calculus, sixth edition, by Howard Anton is 

0-471-15307-9 
which has a check digit of 9. This is consistent with the first nine digits of the ISBN, since 

$\mathbf{a} \cdot \mathbf{b} = (1, 2, 3, 4, 5, 6, 7, 8, 9) \cdot (0, 4, 7, 1, 1, 5, 3, 0, 7) = 152$

Dividing 152 by 11 produces a quotient of 13 and a remainder of 9, so the check digit is c = 9. If an electronic order is placed
for a book with a certain ISBN, then the warehouse can use the above procedure to verify that the check digit is consistent 
with the first nine digits, thereby reducing the possibility of a costly shipping error. 
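
The two-step procedure above can be sketched in Python. This is a minimal illustration under stated assumptions: the function name isbn10_check_digit is hypothetical, and the input is assumed to be the first nine digits given as a string in which hyphens are ignored.

```python
WEIGHTS = (1, 2, 3, 4, 5, 6, 7, 8, 9)  # the vector a from the text


def isbn10_check_digit(first_nine):
    """Return the ISBN-10 check digit for the first nine digits of an ISBN."""
    digits = [int(ch) for ch in first_nine if ch.isdigit()]  # the vector b
    if len(digits) != 9:
        raise ValueError("expected exactly nine digits")
    remainder = sum(w * d for w, d in zip(WEIGHTS, digits)) % 11  # (a . b) mod 11
    return "X" if remainder == 10 else str(remainder)


print(isbn10_check_digit("0-471-15307"))  # 9, matching 0-471-15307-9
```

A warehouse-style consistency check would compare the returned character with the tenth character of the full ISBN.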



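Proof (b) Let $\mathbf{u} = (u_1, u_2, \ldots, u_n)$, $\mathbf{v} = (v_1, v_2, \ldots, v_n)$, and $\mathbf{w} = (w_1, w_2, \ldots, w_n)$. Then

$(\mathbf{u} + \mathbf{v}) \cdot \mathbf{w} = (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n) \cdot (w_1, w_2, \ldots, w_n)$

$= (u_1 + v_1)w_1 + (u_2 + v_2)w_2 + \cdots + (u_n + v_n)w_n$

$= (u_1 w_1 + u_2 w_2 + \cdots + u_n w_n) + (v_1 w_1 + v_2 w_2 + \cdots + v_n w_n)$

$= \mathbf{u} \cdot \mathbf{w} + \mathbf{v} \cdot \mathbf{w}$



Proof (d) We have $\mathbf{v} \cdot \mathbf{v} = v_1^2 + v_2^2 + \cdots + v_n^2 \ge 0$. Further, equality holds if and only if $v_1 = v_2 = \cdots = v_n = 0$; that is, if and
only if $\mathbf{v} = \mathbf{0}$.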



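EXAMPLE 2 Length and Distance in $R^4$



Theorem 4.1.2 allows us to perform computations with Euclidean inner products in much the same way as we perform them
with ordinary arithmetic products. For example,

$(3\mathbf{u} + 2\mathbf{v}) \cdot (4\mathbf{u} + \mathbf{v}) = (3\mathbf{u}) \cdot (4\mathbf{u} + \mathbf{v}) + (2\mathbf{v}) \cdot (4\mathbf{u} + \mathbf{v})$

$= (3\mathbf{u}) \cdot (4\mathbf{u}) + (3\mathbf{u}) \cdot \mathbf{v} + (2\mathbf{v}) \cdot (4\mathbf{u}) + (2\mathbf{v}) \cdot \mathbf{v}$

$= 12(\mathbf{u} \cdot \mathbf{u}) + 3(\mathbf{u} \cdot \mathbf{v}) + 8(\mathbf{v} \cdot \mathbf{u}) + 2(\mathbf{v} \cdot \mathbf{v})$

$= 12(\mathbf{u} \cdot \mathbf{u}) + 11(\mathbf{u} \cdot \mathbf{v}) + 2(\mathbf{v} \cdot \mathbf{v})$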

The reader should determine which parts of Theorem 4.1.2 were used in each step. 
Norm and Distance in Euclidean n-Space

By analogy with the familiar formulas in $R^2$ and $R^3$, we define the Euclidean norm (or Euclidean length) of a vector
$\mathbf{u} = (u_1, u_2, \ldots, u_n)$ in $R^n$ by

$\|\mathbf{u}\| = (\mathbf{u} \cdot \mathbf{u})^{1/2} = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}$ (1)

[Compare this formula to Formulas 1 and 2 in Section 3.2.]
Similarly, the Euclidean distance between the points $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ in $R^n$ is defined by

$d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2 + \cdots + (u_n - v_n)^2}$ (2)

[See Formulas 3 and 4 of Section 3.2.]



EXAMPLE 3 Finding Norm and Distance 

If u = (1, 3, -2, 7) and v = (0, 7, 2, 2), then in the Euclidean space $R^4$,

$\|\mathbf{u}\| = \sqrt{(1)^2 + (3)^2 + (-2)^2 + (7)^2} = \sqrt{63} = 3\sqrt{7}$
and

$d(\mathbf{u}, \mathbf{v}) = \sqrt{(1 - 0)^2 + (3 - 7)^2 + (-2 - 2)^2 + (7 - 2)^2} = \sqrt{58}$
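
Formulas 1 and 2 can be sketched in Python as follows, assuming vectors are tuples of numbers; the names norm and distance are illustrative, and math.sqrt returns floating-point approximations rather than exact radicals.

```python
import math


def norm(u):
    """Euclidean norm of u: the square root of u . u (Formula 1)."""
    return math.sqrt(sum(a * a for a in u))


def distance(u, v):
    """Euclidean distance d(u, v) = ||u - v|| (Formula 2)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return norm(tuple(a - b for a, b in zip(u, v)))


u = (1, 3, -2, 7)
v = (0, 7, 2, 2)
print(norm(u))         # approximately 7.937, the value of sqrt(63)
print(distance(u, v))  # approximately 7.616, the value of sqrt(58)
```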
The following theorem provides one of the most important inequalities in linear algebra: the Cauchy-Schwarz inequality. 
THEOREM 4.1.3 



Cauchy-Schwarz Inequality in $R^n$



If $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ are vectors in $R^n$, then

$|\mathbf{u} \cdot \mathbf{v}| \le \|\mathbf{u}\| \|\mathbf{v}\|$ (3)



In terms of components, 3 is the same as

$|u_1 v_1 + u_2 v_2 + \cdots + u_n v_n| \le (u_1^2 + u_2^2 + \cdots + u_n^2)^{1/2} (v_1^2 + v_2^2 + \cdots + v_n^2)^{1/2}$ (4)



We omit the proof at this time, since a more general version of this theorem will be proved later in the text. However, for
vectors in $R^2$ and $R^3$, this result is a simple consequence of Formula 1 of Section 3.3: If u and v are nonzero vectors in $R^2$ or $R^3$,
then

$|\mathbf{u} \cdot \mathbf{v}| = |\, \|\mathbf{u}\| \|\mathbf{v}\| \cos\theta \,| = \|\mathbf{u}\| \|\mathbf{v}\| |\cos\theta| \le \|\mathbf{u}\| \|\mathbf{v}\|$

and if either $\mathbf{u} = \mathbf{0}$ or $\mathbf{v} = \mathbf{0}$, then both sides of 3 are zero, so the inequality holds in this case as well.
The next two theorems list the basic properties of length and distance in Euclidean n-space.
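
As a numerical illustration (not a proof), the sketch below evaluates both sides of inequality 3 for specific vectors. The helper names dot, norm, and cauchy_schwarz_gap are illustrative assumptions; the vectors are the ones from Example 3.

```python
import math


def dot(u, v):
    """Euclidean inner product of u and v."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean norm of u."""
    return math.sqrt(dot(u, u))


def cauchy_schwarz_gap(u, v):
    """Return ||u|| ||v|| - |u . v|, which inequality 3 says is never negative."""
    return norm(u) * norm(v) - abs(dot(u, v))


u = (1, 3, -2, 7)
v = (0, 7, 2, 2)
print(abs(dot(u, v)))            # 31
print(norm(u) * norm(v))         # roughly 59.9
print(cauchy_schwarz_gap(u, v))  # positive, so the inequality holds here
```

When one vector is a scalar multiple of the other, the two sides agree and the gap is zero up to floating-point rounding.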




Augustin Louis (Baron de) Cauchy 



Augustin Louis (Baron de) Cauchy (1789-1857), French mathematician. Cauchy's early education was acquired from his 
father, a barrister and master of the classics. Cauchy entered L'Ecole Polytechnique in 1805 to study engineering, but because 
of poor health, he was advised to concentrate on mathematics. His major mathematical work began in 1811 with a series of
brilliant solutions to some difficult outstanding problems. 

Cauchy's mathematical contributions for the next 35 years were brilliant and staggering in quantity: over 700 papers filling 
26 modern volumes. Cauchy's work initiated the era of modern analysis; he brought to mathematics standards of precision 
and rigor undreamed of by earlier mathematicians. 

Cauchy's life was inextricably tied to the political upheavals of the time. A strong partisan of the Bourbons, he left his wife 
and children in 1830 to follow the Bourbon king Charles X into exile. For his loyalty he was made a baron by the ex-king. 
Cauchy eventually returned to France but refused to accept a university position until the government waived its requirement 
that he take a loyalty oath. 



It is difficult to get a clear picture of the man. Devoutly Catholic, he sponsored charitable work for unwed mothers and 
criminals and relief for Ireland. Yet other aspects of his life cast him in an unfavorable light. The Norwegian mathematician 
Abel described him as "mad, infinitely Catholic, and bigoted." Some writers praise his teaching, yet others say he rambled 
incoherently and, according to a report of the day, he once devoted an entire lecture to extracting the square root of seventeen 
to ten decimal places by a method well known to his students. In any event, Cauchy is undeniably one of the greatest minds 
in the history of science. 




Hermann Amandus Schwarz



Hermann Amandus Schwarz (1843-1921), German mathematician. Schwarz was the leading mathematician in Berlin in the
first part of the twentieth century. Because of a devotion to his teaching duties at the University of Berlin and a propensity for 
treating both important and trivial facts with equal thoroughness, he did not publish in great volume. He tended to focus on 
narrow concrete problems, but his techniques were often extremely clever and influenced the work of other mathematicians. 
A version of the inequality that bears his name appeared in a paper about surfaces of minimal area published in 1885. 



THEOREM 4.1.4 



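Properties of Length in $R^n$

If u and v are vectors in $R^n$ and k is any scalar, then:

(a) $\|\mathbf{u}\| \ge 0$

(b) $\|\mathbf{u}\| = 0$ if and only if $\mathbf{u} = \mathbf{0}$

(c) $\|k\mathbf{u}\| = |k| \|\mathbf{u}\|$

(d) $\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$ (Triangle inequality)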



We shall prove (c) and (d) and leave (a) and (b) as exercises. 



Proof (c) If $\mathbf{u} = (u_1, u_2, \ldots, u_n)$, then $k\mathbf{u} = (ku_1, ku_2, \ldots, ku_n)$, so

$\|k\mathbf{u}\| = \sqrt{(ku_1)^2 + (ku_2)^2 + \cdots + (ku_n)^2}$

$= |k| \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}$

$= |k| \|\mathbf{u}\|$



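Proof (d)

$\|\mathbf{u} + \mathbf{v}\|^2 = (\mathbf{u} + \mathbf{v}) \cdot (\mathbf{u} + \mathbf{v}) = (\mathbf{u} \cdot \mathbf{u}) + 2(\mathbf{u} \cdot \mathbf{v}) + (\mathbf{v} \cdot \mathbf{v})$

$= \|\mathbf{u}\|^2 + 2(\mathbf{u} \cdot \mathbf{v}) + \|\mathbf{v}\|^2$

$\le \|\mathbf{u}\|^2 + 2|\mathbf{u} \cdot \mathbf{v}| + \|\mathbf{v}\|^2$ [Property of absolute value]

$\le \|\mathbf{u}\|^2 + 2\|\mathbf{u}\| \|\mathbf{v}\| + \|\mathbf{v}\|^2$ [Cauchy-Schwarz inequality]

$= (\|\mathbf{u}\| + \|\mathbf{v}\|)^2$

The result now follows on taking square roots of both sides.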

Part (c) of this theorem states that multiplying a vector by a scalar k multiplies the length of that vector by a factor of \k\ (Figure 
4.1.2a). Part (d) of this theorem is known as the triangle inequality because it generalizes the familiar result from Euclidean 
geometry that states that the sum of the lengths of any two sides of a triangle is at least as large as the length of the third side 
(Figure 4.1.2b).
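
Parts (c) and (d) of Theorem 4.1.4 can be spot-checked numerically. The following is a small sketch, assuming tuple vectors and illustrative helper names; the sample vectors and the scalar k = -2 are arbitrary choices, and math.isclose is used because square roots are computed in floating point.

```python
import math


def norm(u):
    """Euclidean norm of u."""
    return math.sqrt(sum(a * a for a in u))


def scale(k, u):
    """Scalar multiple ku."""
    return tuple(k * a for a in u)


def add(u, v):
    """Componentwise sum u + v."""
    return tuple(a + b for a, b in zip(u, v))


u = (4, 1, 2, 3)
v = (0, 3, 8, -2)
k = -2

# Part (c): scaling by k multiplies the length by |k|.
print(math.isclose(norm(scale(k, u)), abs(k) * norm(u)))  # True

# Part (d): the triangle inequality.
print(norm(add(u, v)) <= norm(u) + norm(v))  # True
```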




Figure 4.1.2 (a) $\|k\mathbf{u}\| = |k| \|\mathbf{u}\|$ (b) $\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$

The results in the next theorem are immediate consequences of those in Theorem 4.1.4, as applied to the distance function 
$d(\mathbf{u}, \mathbf{v})$ on $R^n$. They generalize the familiar results for $R^2$ and $R^3$.

THEOREM 4.1.5 



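Properties of Distance in $R^n$

If u, v, and w are vectors in $R^n$ and k is any scalar, then:

(a) $d(\mathbf{u}, \mathbf{v}) \ge 0$

(b) $d(\mathbf{u}, \mathbf{v}) = 0$ if and only if $\mathbf{u} = \mathbf{v}$

(c) $d(\mathbf{u}, \mathbf{v}) = d(\mathbf{v}, \mathbf{u})$

(d) $d(\mathbf{u}, \mathbf{v}) \le d(\mathbf{u}, \mathbf{w}) + d(\mathbf{w}, \mathbf{v})$ (Triangle inequality)



We shall prove part (d) and leave the remaining parts as exercises.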



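Proof (d) From 2 and part (d) of Theorem 4.1.4, we have

$d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| = \|(\mathbf{u} - \mathbf{w}) + (\mathbf{w} - \mathbf{v})\|$

$\le \|\mathbf{u} - \mathbf{w}\| + \|\mathbf{w} - \mathbf{v}\| = d(\mathbf{u}, \mathbf{w}) + d(\mathbf{w}, \mathbf{v})$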



Part (d) of this theorem, which is also called the triangle inequality, generalizes the familiar result from Euclidean geometry that 
states that the shortest distance between two points is along a straight line (Figure 4.1.3). 




Figure 4.1.3 

Formula 1 expresses the norm of a vector in terms of a dot product. The following useful theorem expresses the dot product in 
terms of norms. 



THEOREM 4.1.6 



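$\mathbf{u} \cdot \mathbf{v} = \frac{1}{4}\|\mathbf{u} + \mathbf{v}\|^2 - \frac{1}{4}\|\mathbf{u} - \mathbf{v}\|^2$ (6)



Proof

$\|\mathbf{u} + \mathbf{v}\|^2 = (\mathbf{u} + \mathbf{v}) \cdot (\mathbf{u} + \mathbf{v}) = \|\mathbf{u}\|^2 + 2(\mathbf{u} \cdot \mathbf{v}) + \|\mathbf{v}\|^2$

$\|\mathbf{u} - \mathbf{v}\|^2 = (\mathbf{u} - \mathbf{v}) \cdot (\mathbf{u} - \mathbf{v}) = \|\mathbf{u}\|^2 - 2(\mathbf{u} \cdot \mathbf{v}) + \|\mathbf{v}\|^2$

from which 6 follows by simple algebra: subtracting the second equation from the first gives $\|\mathbf{u} + \mathbf{v}\|^2 - \|\mathbf{u} - \mathbf{v}\|^2 = 4(\mathbf{u} \cdot \mathbf{v})$.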

Some problems that use this theorem are given in the exercises. 
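
Formula 6 can be verified for particular vectors. The sketch below is a minimal check, assuming tuple vectors with integer components so that squared norms are computed exactly; the helper names are illustrative, and the sample vectors are the ones from Example 3.

```python
def dot(u, v):
    """Euclidean inner product of u and v."""
    return sum(a * b for a, b in zip(u, v))


def norm_squared(u):
    """Squared Euclidean norm, computed without taking a square root."""
    return dot(u, u)


def dot_from_norms(u, v):
    """Right side of Formula 6: (1/4)||u + v||^2 - (1/4)||u - v||^2."""
    total = tuple(a + b for a, b in zip(u, v))
    diff = tuple(a - b for a, b in zip(u, v))
    return (norm_squared(total) - norm_squared(diff)) / 4


u = (1, 3, -2, 7)
v = (0, 7, 2, 2)
print(dot(u, v))             # 31
print(dot_from_norms(u, v))  # 31.0
```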

Orthogonality 

Recall that in the Euclidean spaces $R^2$ and $R^3$, two vectors u and v are defined to be orthogonal (perpendicular) if $\mathbf{u} \cdot \mathbf{v} = 0$
(Section 3.3). Motivated by this, we make the following definition.



DEFINITION 



Two vectors u and v in $R^n$ are called orthogonal if $\mathbf{u} \cdot \mathbf{v} = 0$.



EXAMPLE 4 Orthogonal Vectors in $R^4$



In the Euclidean space $R^4$ the vectors

u = (-2, 3, 1, 4) and v = (1, 2, 0, -1)
are orthogonal, since

$\mathbf{u} \cdot \mathbf{v} = (-2)(1) + (3)(2) + (1)(0) + (4)(-1) = 0$

Properties of orthogonal vectors will be discussed in more detail later in the text, but we note at this point that many of the 
familiar properties of orthogonal vectors in the Euclidean spaces $R^2$ and $R^3$ continue to hold in the Euclidean space $R^n$. For
example, if u and v are orthogonal vectors in $R^2$ or $R^3$, then u, v, and u + v form the sides of a right triangle (Figure 4.1.4); thus,
by the Theorem of Pythagoras,

$\|\mathbf{u} + \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2$



The following theorem shows that this result extends to $R^n$.



Figure 4.1.4



THEOREM 4.1.7 



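Pythagorean Theorem in $R^n$

If u and v are orthogonal vectors in $R^n$ with the Euclidean inner product, then

$\|\mathbf{u} + \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2$



Proof

$\|\mathbf{u} + \mathbf{v}\|^2 = (\mathbf{u} + \mathbf{v}) \cdot (\mathbf{u} + \mathbf{v}) = \|\mathbf{u}\|^2 + 2(\mathbf{u} \cdot \mathbf{v}) + \|\mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2$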
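
The orthogonal vectors of Example 4 give a concrete instance of Theorem 4.1.7. The following sketch, with illustrative helper names and integer arithmetic throughout, checks both the orthogonality and the Pythagorean relation.

```python
def dot(u, v):
    """Euclidean inner product of u and v."""
    return sum(a * b for a, b in zip(u, v))


def is_orthogonal(u, v):
    """Two vectors are orthogonal when their inner product is zero."""
    return dot(u, v) == 0


u = (-2, 3, 1, 4)
v = (1, 2, 0, -1)
total = tuple(a + b for a, b in zip(u, v))  # u + v = (-1, 5, 1, 3)

print(is_orthogonal(u, v))       # True
print(dot(total, total))         # 36, the value of ||u + v||^2
print(dot(u, u) + dot(v, v))     # 36, the value of ||u||^2 + ||v||^2
```

The exact comparison with zero is appropriate here because the components are integers; with floating-point data a tolerance would be the safer choice.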



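Alternative Notations for Vectors in $R^n$



It is often useful to write a vector $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ in $R^n$ in matrix notation as a row matrix or a column matrix:

$\mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}$ or $\mathbf{u} = [u_1 \ u_2 \ \cdots \ u_n]$

This is justified because the matrix operations

$\mathbf{u} + \mathbf{v} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} + \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} = \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \\ \vdots \\ u_n + v_n \end{bmatrix}, \quad k\mathbf{u} = k \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} = \begin{bmatrix} ku_1 \\ ku_2 \\ \vdots \\ ku_n \end{bmatrix}$

or

$\mathbf{u} + \mathbf{v} = [u_1 \ u_2 \ \cdots \ u_n] + [v_1 \ v_2 \ \cdots \ v_n] = [u_1 + v_1 \ \ u_2 + v_2 \ \ \cdots \ \ u_n + v_n]$

$k\mathbf{u} = k[u_1 \ u_2 \ \cdots \ u_n] = [ku_1 \ ku_2 \ \cdots \ ku_n]$

produce the same results as the vector operations

$\mathbf{u} + \mathbf{v} = (u_1, u_2, \ldots, u_n) + (v_1, v_2, \ldots, v_n) = (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n)$

$k\mathbf{u} = k(u_1, u_2, \ldots, u_n) = (ku_1, ku_2, \ldots, ku_n)$

The only difference is the form in which the vectors are written.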



A Matrix Formula for the Dot Product 



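If we use column matrix notation for the vectors

$\mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}$ and $\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$

and omit the brackets on 1 × 1 matrices, then it follows that

$\mathbf{v}^T \mathbf{u} = [v_1 \ v_2 \ \cdots \ v_n] \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} = [u_1 v_1 + u_2 v_2 + \cdots + u_n v_n] = [\mathbf{u} \cdot \mathbf{v}] = \mathbf{u} \cdot \mathbf{v}$

Thus, for vectors in column matrix notation, we have the following formula for the Euclidean inner product:

$\mathbf{u} \cdot \mathbf{v} = \mathbf{v}^T \mathbf{u}$ (7)

For example, if

$\mathbf{u} = \begin{bmatrix} -1 \\ 3 \\ 5 \\ 7 \end{bmatrix}$ and $\mathbf{v} = \begin{bmatrix} 5 \\ -4 \\ 7 \\ 0 \end{bmatrix}$

then

$\mathbf{u} \cdot \mathbf{v} = \mathbf{v}^T \mathbf{u} = [5 \ {-4} \ 7 \ 0] \begin{bmatrix} -1 \\ 3 \\ 5 \\ 7 \end{bmatrix} = [18] = 18$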



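If A is an n × n matrix, then it follows from Formula 7 and properties of the transpose that

$A\mathbf{u} \cdot \mathbf{v} = \mathbf{v}^T (A\mathbf{u}) = (\mathbf{v}^T A)\mathbf{u} = (A^T \mathbf{v})^T \mathbf{u} = \mathbf{u} \cdot A^T \mathbf{v}$

$\mathbf{u} \cdot A\mathbf{v} = (A\mathbf{v})^T \mathbf{u} = (\mathbf{v}^T A^T)\mathbf{u} = \mathbf{v}^T (A^T \mathbf{u}) = A^T \mathbf{u} \cdot \mathbf{v}$

The resulting formulas

$A\mathbf{u} \cdot \mathbf{v} = \mathbf{u} \cdot A^T \mathbf{v}$ (8)

$\mathbf{u} \cdot A\mathbf{v} = A^T \mathbf{u} \cdot \mathbf{v}$ (9)

provide an important link between multiplication by an n × n matrix A and multiplication by $A^T$.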



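EXAMPLE 5 Verifying That $A\mathbf{u} \cdot \mathbf{v} = \mathbf{u} \cdot A^T \mathbf{v}$



Suppose that

$A = \begin{bmatrix} 1 & -2 & 3 \\ 2 & 4 & 1 \\ -1 & 0 & 1 \end{bmatrix}, \quad \mathbf{u} = \begin{bmatrix} -1 \\ 2 \\ 4 \end{bmatrix}, \quad \mathbf{v} = \begin{bmatrix} -2 \\ 0 \\ 5 \end{bmatrix}$

Then

$A\mathbf{u} = \begin{bmatrix} 1 & -2 & 3 \\ 2 & 4 & 1 \\ -1 & 0 & 1 \end{bmatrix} \begin{bmatrix} -1 \\ 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 7 \\ 10 \\ 5 \end{bmatrix}$

$A^T \mathbf{v} = \begin{bmatrix} 1 & 2 & -1 \\ -2 & 4 & 0 \\ 3 & 1 & 1 \end{bmatrix} \begin{bmatrix} -2 \\ 0 \\ 5 \end{bmatrix} = \begin{bmatrix} -7 \\ 4 \\ -1 \end{bmatrix}$

from which we obtain

$A\mathbf{u} \cdot \mathbf{v} = 7(-2) + 10(0) + 5(5) = 11$

$\mathbf{u} \cdot A^T \mathbf{v} = (-1)(-7) + 2(4) + 4(-1) = 11$

Thus $A\mathbf{u} \cdot \mathbf{v} = \mathbf{u} \cdot A^T \mathbf{v}$ as guaranteed by Formula 8. We leave it for the reader to verify that 9 also holds.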
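
The computations of Example 5 can be reproduced with a short Python sketch. It assumes a matrix is stored as a list of rows and a vector as a tuple; the helper names dot, transpose, and mat_vec are illustrative.

```python
def dot(u, v):
    """Euclidean inner product of u and v."""
    return sum(a * b for a, b in zip(u, v))


def transpose(A):
    """Return the transpose of A, where A is a list of rows."""
    return [list(col) for col in zip(*A)]


def mat_vec(A, x):
    """Return the product Ax as a tuple."""
    return tuple(dot(row, x) for row in A)


A = [[1, -2, 3],
     [2, 4, 1],
     [-1, 0, 1]]
u = (-1, 2, 4)
v = (-2, 0, 5)

Au = mat_vec(A, u)               # (7, 10, 5)
ATv = mat_vec(transpose(A), v)   # (-7, 4, -1)
print(dot(Au, v), dot(u, ATv))   # 11 11  (Formula 8)

Av = mat_vec(A, v)
ATu = mat_vec(transpose(A), u)
print(dot(u, Av), dot(ATu, v))   # 17 17  (Formula 9)
```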

A Dot Product View of Matrix Multiplication 

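Dot products provide another way of thinking about matrix multiplication. Recall that if $A = [a_{ij}]$ is an m × r matrix and
$B = [b_{ij}]$ is an r × n matrix, then the ijth entry of AB is

$a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{ir} b_{rj}$

which is the dot product of the ith row vector of A

$[a_{i1} \ a_{i2} \ \cdots \ a_{ir}]$

and the jth column vector of B

$\begin{bmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{rj} \end{bmatrix}$

Thus, if the row vectors of A are $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_m$ and the column vectors of B are $\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_n$, then the matrix product AB can be
expressed as

$AB = \begin{bmatrix} \mathbf{r}_1 \cdot \mathbf{c}_1 & \mathbf{r}_1 \cdot \mathbf{c}_2 & \cdots & \mathbf{r}_1 \cdot \mathbf{c}_n \\ \mathbf{r}_2 \cdot \mathbf{c}_1 & \mathbf{r}_2 \cdot \mathbf{c}_2 & \cdots & \mathbf{r}_2 \cdot \mathbf{c}_n \\ \vdots & \vdots & & \vdots \\ \mathbf{r}_m \cdot \mathbf{c}_1 & \mathbf{r}_m \cdot \mathbf{c}_2 & \cdots & \mathbf{r}_m \cdot \mathbf{c}_n \end{bmatrix}$ (10)

In particular, a linear system $A\mathbf{x} = \mathbf{b}$ can be expressed in dot product form as

$\begin{bmatrix} \mathbf{r}_1 \cdot \mathbf{x} \\ \mathbf{r}_2 \cdot \mathbf{x} \\ \vdots \\ \mathbf{r}_m \cdot \mathbf{x} \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$ (11)

where $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_m$ are the row vectors of A, and $b_1, b_2, \ldots, b_m$ are the entries of b.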



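EXAMPLE 6 A Linear System Written in Dot Product Form

The following is an example of a linear system expressed in dot product form 11.

System

$$\begin{aligned} 3x_1 - 4x_2 + x_3 &= 1 \\ 2x_1 - 7x_2 - 4x_3 &= 5 \\ x_1 + 5x_2 - 8x_3 &= 0 \end{aligned}$$

Dot Product Form

$$\begin{bmatrix} (3, -4, 1)\cdot(x_1, x_2, x_3) \\ (2, -7, -4)\cdot(x_1, x_2, x_3) \\ (1, 5, -8)\cdot(x_1, x_2, x_3) \end{bmatrix} = \begin{bmatrix} 1 \\ 5 \\ 0 \end{bmatrix}$$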
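The dot product view of Formulas 10 and 11 translates directly into code. The sketch below is one possible illustration, not a prescribed implementation: `matmul_by_dots` builds each entry of a product from a row-column dot product, and `system_lhs` evaluates the left-hand sides of a system in dot product form. The function names and the trial point `x_trial` are hypothetical; the coefficient rows are those of the system in Example 6.

```python
def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def matmul_by_dots(A, B):
    """Entry (i, j) of AB is (row i of A) . (column j of B), as in Formula 10."""
    columns = list(zip(*B))
    return [[dot(row, col) for col in columns] for row in A]

def system_lhs(A, x):
    """Left-hand sides r_i . x of the system Ax = b, as in Formula 11."""
    return [dot(row, x) for row in A]

# Coefficient rows of the system in Example 6.
A = [[3, -4, 1],
     [2, -7, -4],
     [1, 5, -8]]

x_trial = [1, 0, 0]            # hypothetical trial point, not a solution
lhs = system_lhs(A, x_trial)   # one dot product per equation
print(lhs)

# A small product computed entry by entry from dot products.
print(matmul_by_dots([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Comparing `lhs` with the right-hand sides of the system shows at a glance whether a candidate vector satisfies each equation.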



Exercise Set 4.1 






1. Let u = (-3, 2, 1, 0), v = (4, 7, -3, 2), and w = (5, -2, 8, 1). Find

(a) v - w

(b) 2u + 7v

(c) -u + (v - 4w)

(d) 6(u - 3v)

(e) -v - w

(f) (6v - w) - (4u + v)

2. Let u, v, and w be the vectors in Exercise 1. Find the vector x that satisfies 5x - 2v = 2(w - 5x).

3. Let u_1 = (-1, 3, 2, 0), u_2 = (2, 0, 4, -1), u_3 = (7, 1, 1, 4), and u_4 = (6, 3, 1, 2). Find scalars c_1, c_2, c_3, and c_4 such that
c_1 u_1 + c_2 u_2 + c_3 u_3 + c_4 u_4 = (0, 5, 6, -3).

4. Show that there do not exist scalars c_1, c_2, and c_3 such that
c_1(1, 0, 1, 0) + c_2(1, 0, -2, 1) + c_3(2, 0, 1, 2) = (1, -2, 2, 3)

5. In each part, compute the Euclidean norm of the vector.

(a) (-2, 5)

(b) (1, 2, -2)

(c) (3, 4, 0, -12)

(d) (-2, 1, 1, -3, 4)



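6. Let u = (4, 1, 2, 3), v = (0, 3, 8, -2), and w = (3, 1, 2, 2). Evaluate each expression.

(a) $\|\mathbf{u} + \mathbf{v}\|$

(b) $\|\mathbf{u}\| + \|\mathbf{v}\|$

(c) $\|-2\mathbf{u}\| + 2\|\mathbf{u}\|$

(d) $\|3\mathbf{u} - 5\mathbf{v} + \mathbf{w}\|$

(e) $\dfrac{1}{\|\mathbf{w}\|}\mathbf{w}$

(f) $\left\|\dfrac{1}{\|\mathbf{w}\|}\mathbf{w}\right\|$

7. Show that if v is a nonzero vector in $R^n$, then $(1/\|\mathbf{v}\|)\mathbf{v}$ has Euclidean norm 1.

8. Let v = (-2, 3, 0, 6). Find all scalars k such that $\|k\mathbf{v}\| = 5$.

9. Find the Euclidean inner product $\mathbf{u}\cdot\mathbf{v}$.

(a) u = (2, 5), v = (-4, 3)

(b) u = (2, 8, 2), v = (0, 1, 3)

(c) u = (3, 1, 4, -5), v = (2, 2, -4, -3)

(d) u = (-1, 1, 0, 4, -3), v = (-2, -2, 0, 2, -1)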



10.

(a) Find two vectors in $R^2$ with Euclidean norm 1 whose Euclidean inner product with (3, -1) is zero.

(b) Show that there are infinitely many vectors in $R^3$ with Euclidean norm 1 whose Euclidean inner product with (1, -3, 5) is zero.

11. Find the Euclidean distance between u and v.

(a) u = (1, -2), v = (2, 1)

(b) u = (2, -2, 2), v = (0, 4, -2)

(c) u = (0, -2, -1, 1), v = (-3, 2, 4, 4)

(d) u = (3, -3, -2, 0, -3), v = (-4, 1, -1, 5, 0)

12. Verify parts (b), (e), (f), and (g) of Theorem 4.1.1 for u = (2, 0, -3, 1), v = (4, 0, 3, 5), w = (1, 6, 2, -1), k = 5, and l = -3.

13. Verify parts (b) and (c) of Theorem 4.1.2 for the values of u, v, w, and k in Exercise 12.

14. In each part, determine whether the given vectors are orthogonal.

(a) u = (-1, 3, 2), v = (4, 2, -1)

(b) u = (-2, -2, -2), v = (1, 1, 1)

(c) u = (u_1, u_2, u_3), v = (0, 0, 0)

(d) u = (-4, 6, -10, 1), v = (2, 1, -2, 9)

(e) u = (0, 3, -2, 1), v = (5, 2, -1, 0)

(f) u = (a, b), v = (-b, a)

15. For which values of k are u and v orthogonal?

(a) u = (2, 1, 3), v = (1, 7, k)

(b) u = (k, k, 1), v = (k, 5, 6)



16. Find two vectors of norm 1 that are orthogonal to the three vectors u = (2, 1, -4, 0), v = (-1, -1, 2, 2), and w = (3, 2, 5, 4).

17. In each part, verify that the Cauchy-Schwarz inequality holds.

(a) u = (3, 2), v = (4, -1)

(b) u = (-3, 1, 0), v = (2, -1, 3)

(c) u = (-4, 2, 1), v = (8, -4, -2)

(d) u = (0, -2, 2, 1), v = (-1, -1, 1, 1)



18. 



In each part, verify that Formulas 8 and 9 hold. 



(a) 



A = 



"2 


-1" 


, u = 


"3" 


, y — 


"-2" 


_3 


4_ 




_1_ 




6_ 



(b) 



A = 



1 


2 4" 




"-1" 




0" 


3 


1 


, u = 


2 


,v = 


2 


5 


-2 3 




5 




-4 



19. Solve the following linear system for $x_1$, $x_2$, and $x_3$.

(1, -1, 4) · (x_1, x_2, x_3) = 10

(3, 2, 0) · (x_1, x_2, x_3) = 1

(4, -5, -1) · (x_1, x_2, x_3) = 1

20. Find $\mathbf{u}\cdot\mathbf{v}$ given that $\|\mathbf{u} + \mathbf{v}\| = 1$ and $\|\mathbf{u} - \mathbf{v}\| = 5$.



21. Use Theorem 4.1.6 to show that u and v are orthogonal vectors in $R^n$ if $\|\mathbf{u} + \mathbf{v}\| = \|\mathbf{u} - \mathbf{v}\|$. Interpret this result geometrically in $R^2$.

22. The formulas for the vector components in Theorem 3.3.3 hold in $R^n$ as well. Given that a = (-1, 1, 2, 3) and u = (2, 1, 4, -1), find the vector component of u along a and the vector component of u orthogonal to a.

23. Determine whether the two lines

r = (3, 2, 3, -1) + t(4, 6, 4, -2) and r = (0, 3, 5, 4) + s(1, -3, -4, -2)

intersect in $R^4$.

24. Prove the following generalization of Theorem 4.1.7. If $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ are pairwise orthogonal vectors in $R^n$, then

$$\|\mathbf{v}_1 + \mathbf{v}_2 + \cdots + \mathbf{v}_r\|^2 = \|\mathbf{v}_1\|^2 + \|\mathbf{v}_2\|^2 + \cdots + \|\mathbf{v}_r\|^2$$

25. Prove: If u and v are $n \times 1$ matrices and $A$ is an $n \times n$ matrix, then

$$(\mathbf{v}^TA^TA\mathbf{u})^2 \le (\mathbf{u}^TA^TA\mathbf{u})(\mathbf{v}^TA^TA\mathbf{v})$$

26. Use the Cauchy-Schwarz inequality to prove that for all real values of $a$, $b$, and $\theta$,

$$(a\cos\theta + b\sin\theta)^2 \le a^2 + b^2$$

27. Prove: If u, v, and w are vectors in $R^n$ and k is any scalar, then

(a) $\mathbf{u}\cdot(k\mathbf{v}) = k(\mathbf{u}\cdot\mathbf{v})$

(b) $\mathbf{u}\cdot(\mathbf{v} + \mathbf{w}) = \mathbf{u}\cdot\mathbf{v} + \mathbf{u}\cdot\mathbf{w}$

28. Prove parts (a) through (d) of Theorem 4.1.1.

29. Prove parts (e) through (h) of Theorem 4.1.1.

30. Prove parts (a) and (c) of Theorem 4.1.2.

31. Prove parts (a) and (b) of Theorem 4.1.4.

32. Prove parts (a), (b), and (c) of Theorem 4.1.5.

33. Suppose that $a_1, a_2, \ldots, a_n$ are positive real numbers. In $R^2$, the vectors $\mathbf{v}_1 = (a_1, 0)$ and $\mathbf{v}_2 = (0, a_2)$ determine a rectangle of area $A = a_1a_2$ (see the accompanying figure), and in $R^3$, the vectors $\mathbf{v}_1 = (a_1, 0, 0)$, $\mathbf{v}_2 = (0, a_2, 0)$, and $\mathbf{v}_3 = (0, 0, a_3)$ determine a box of volume $V = a_1a_2a_3$ (see the accompanying figure). The area A and the volume V are sometimes called the Euclidean measure of the rectangle and box, respectively.

(a) How would you define the Euclidean measure of the "box" in $R^n$ that is determined by the vectors

$\mathbf{v}_1 = (a_1, 0, 0, \ldots, 0)$, $\mathbf{v}_2 = (0, a_2, 0, \ldots, 0)$, ..., $\mathbf{v}_n = (0, 0, 0, \ldots, a_n)$?

(b) How would you define the Euclidean length of the "diagonal" of the box in part (a)?

Figure Ex-33 (left: rectangle with area $A = a_1a_2$; right: box with volume $V = a_1a_2a_3$)



Discussion and Discovery

34.

(a) Suppose that u and v are vectors in $R^n$. Show that

$$\|\mathbf{u} + \mathbf{v}\|^2 + \|\mathbf{u} - \mathbf{v}\|^2 = 2\left(\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2\right)$$

(b) The result in part (a) states a theorem about parallelograms in $R^2$. What is the theorem?

35.

(a) If u and v are orthogonal vectors in $R^n$ such that $\|\mathbf{u}\| = 1$ and $\|\mathbf{v}\| = 1$, then $d(\mathbf{u}, \mathbf{v}) =$ ______.

(b) Draw a picture to illustrate this result.

36. In the accompanying figure the vectors u, v, and u - v form a triangle in $R^2$, and $\theta$ denotes the angle between u and v. It follows from the law of cosines in trigonometry that

$$\|\mathbf{u} - \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 - 2\|\mathbf{u}\|\|\mathbf{v}\|\cos\theta$$

Do you think that this formula still holds if u and v are vectors in $R^n$? Justify your answer.

Figure Ex-36

37. Indicate whether each statement is always true or sometimes false. Justify your answer by giving a logical argument or a counterexample.

(a) If $\|\mathbf{u} + \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2$, then u and v are orthogonal.

(b) If u is orthogonal to v and w, then u is orthogonal to v + w.

(c) If u is orthogonal to v + w, then u is orthogonal to v and w.

(d) If $\|\mathbf{u} - \mathbf{v}\| = 0$, then u = v.

(e) If $\|k\mathbf{u}\| = k\|\mathbf{u}\|$, then $k \ge 0$.
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4.2 

LINEAR TRANSFORMATIONS FROM $R^n$ TO $R^m$



In this section we shall begin the study of functions of the form w = F(x),
where the independent variable x is a vector in $R^n$ and the dependent variable
w is a vector in $R^m$. We shall concentrate on a special class of such functions
called "linear transformations." Linear transformations are fundamental in the
study of linear algebra and have many important applications in physics,
engineering, social sciences, and various branches of mathematics.



Functions from $R^n$ to $R$

Recall that a function is a rule f that associates with each element in a set A one and only one element in a set B. If f associates
the element b with the element a, then we write b = f(a) and say that b is the image of a under f or that f(a) is the value of f at
a. The set A is called the domain of f and the set B is called the codomain of f. The subset of B consisting of all possible values
for f as a varies over A is called the range of f. For the most common functions, A and B are sets of real numbers, in which case f
is called a real-valued function of a real variable. Other common functions occur when B is a set of real numbers and A is a set
of vectors in $R^2$, $R^3$, or, more generally, $R^n$. Some examples are shown in Table 1. Two functions $f_1$ and $f_2$ are regarded as
equal, written $f_1 = f_2$, if they have the same domain and $f_1(a) = f_2(a)$ for all a in the domain.

Table 1

Formula | Example | Classification | Description
$f(x)$ | $f(x) = x^2$ | Real-valued function of a real variable | Function from $R$ to $R$
$f(x, y)$ | $f(x, y) = x^2 + y^2$ | Real-valued function of two real variables | Function from $R^2$ to $R$
$f(x, y, z)$ | $f(x, y, z) = x^2 + y^2 + z^2$ | Real-valued function of three real variables | Function from $R^3$ to $R$
$f(x_1, x_2, \ldots, x_n)$ | $f(x_1, x_2, \ldots, x_n) = x_1^2 + x_2^2 + \cdots + x_n^2$ | Real-valued function of n real variables | Function from $R^n$ to $R$



Functions from $R^n$ to $R^m$

If the domain of a function f is $R^n$ and the codomain is $R^m$ (m and n possibly the same), then f is called a map or transformation
from $R^n$ to $R^m$, and we say that the function f maps $R^n$ into $R^m$. We denote this by writing $f: R^n \to R^m$. The functions in
Table 1 are transformations for which m = 1. In the case where m = n, the transformation $f: R^n \to R^n$ is called an operator on
$R^n$. The first entry in Table 1 is an operator on $R$.

To illustrate one important way in which transformations can arise, suppose that $f_1, f_2, \ldots, f_m$ are real-valued functions of n
real variables, say

$$\begin{aligned} w_1 &= f_1(x_1, x_2, \ldots, x_n) \\ w_2 &= f_2(x_1, x_2, \ldots, x_n) \\ &\;\;\vdots \\ w_m &= f_m(x_1, x_2, \ldots, x_n) \end{aligned} \tag{1}$$

These m equations assign a unique point $(w_1, w_2, \ldots, w_m)$ in $R^m$ to each point $(x_1, x_2, \ldots, x_n)$ in $R^n$ and thus define a
transformation from $R^n$ to $R^m$. If we denote this transformation by T, then $T: R^n \to R^m$ and

$$T(x_1, x_2, \ldots, x_n) = (w_1, w_2, \ldots, w_m)$$



EXAMPLE 1 A Transformation from $R^2$ to $R^3$

The equations

$$\begin{aligned} w_1 &= x_1 + x_2 \\ w_2 &= 3x_1x_2 \\ w_3 &= x_1^2 - x_2^2 \end{aligned}$$

define a transformation $T: R^2 \to R^3$. With this transformation, the image of the point $(x_1, x_2)$ is

$$T(x_1, x_2) = (x_1 + x_2,\; 3x_1x_2,\; x_1^2 - x_2^2)$$

Thus, for example, $T(1, -2) = (-1, -6, -3)$.
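As a quick numerical companion to Example 1, the sketch below evaluates the transformation $T(x_1, x_2) = (x_1 + x_2, 3x_1x_2, x_1^2 - x_2^2)$ at the point used in the example. The second evaluation, at a doubled input, is an added illustration (not part of the example) suggesting why equations containing products and squares are not of the linear form.

```python
def T(x1, x2):
    """The transformation of Example 1, mapping a point of R^2 to a point of R^3."""
    return (x1 + x2, 3 * x1 * x2, x1**2 - x2**2)

image = T(1, -2)
print(image)

# Doubling the input does not simply double the image,
# because w2 and w3 involve products and squares of the variables.
image_of_doubled_input = T(2, -4)
doubled_image = tuple(2 * w for w in image)
print(image_of_doubled_input, doubled_image)
```

Only the first component, which comes from a linear equation, doubles along with the input.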



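Linear Transformations from $R^n$ to $R^m$

In the special case where the equations in 1 are linear, the transformation $T: R^n \to R^m$ defined by those equations is called a
linear transformation (or a linear operator if m = n). Thus a linear transformation $T: R^n \to R^m$ is defined by equations of the
form

$$\begin{aligned} w_1 &= a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ w_2 &= a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ &\;\;\vdots \\ w_m &= a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{aligned} \tag{2}$$

or, in matrix notation,

$$\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_m \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \tag{3}$$

or more briefly by

$$\mathbf{w} = A\mathbf{x} \tag{4}$$

The matrix $A = [a_{ij}]$ is called the standard matrix for the linear transformation T, and T is called multiplication by A.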



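EXAMPLE 2 A Linear Transformation from $R^4$ to $R^3$

The linear transformation $T: R^4 \to R^3$ defined by the equations

$$\begin{aligned} w_1 &= 2x_1 - 3x_2 + x_3 - 5x_4 \\ w_2 &= 4x_1 + x_2 - 2x_3 + x_4 \\ w_3 &= 5x_1 - x_2 + 4x_3 \end{aligned} \tag{5}$$

can be expressed in matrix form as

$$\begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} 2 & -3 & 1 & -5 \\ 4 & 1 & -2 & 1 \\ 5 & -1 & 4 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} \tag{6}$$

so the standard matrix for T is

$$A = \begin{bmatrix} 2 & -3 & 1 & -5 \\ 4 & 1 & -2 & 1 \\ 5 & -1 & 4 & 0 \end{bmatrix}$$

The image of a point $(x_1, x_2, x_3, x_4)$ can be computed directly from the defining equations 5 or from 6 by matrix multiplication.
For example, if $(x_1, x_2, x_3, x_4) = (1, -3, 0, 2)$, then substituting in 5 yields

$$w_1 = 1, \quad w_2 = 3, \quad w_3 = 8$$

(verify) or alternatively from 6,

$$\begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} 2 & -3 & 1 & -5 \\ 4 & 1 & -2 & 1 \\ 5 & -1 & 4 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -3 \\ 0 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 8 \end{bmatrix}$$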
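The two routes to the image in Example 2 (substituting into the defining equations, or multiplying by the standard matrix) can be compared in a few lines of Python. This is only an illustrative sketch; the names `T_equations` and `mat_vec` are hypothetical helpers, and the coefficients are those of equations 5.

```python
def T_equations(x1, x2, x3, x4):
    """Image computed directly from the defining equations 5."""
    w1 = 2 * x1 - 3 * x2 + x3 - 5 * x4
    w2 = 4 * x1 + x2 - 2 * x3 + x4
    w3 = 5 * x1 - x2 + 4 * x3
    return [w1, w2, w3]

def mat_vec(A, x):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Standard matrix for T: one row of coefficients per equation.
A = [[2, -3, 1, -5],
     [4, 1, -2, 1],
     [5, -1, 4, 0]]

x = [1, -3, 0, 2]
by_equations = T_equations(*x)
by_matrix = mat_vec(A, x)
print(by_equations, by_matrix)
```

Both computations give the same image, which is the content of the statement that T is multiplication by A.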







Some Notational Matters

If $T: R^n \to R^m$ is multiplication by A, and if it is important to emphasize that A is the standard matrix for T, we shall denote the
linear transformation $T: R^n \to R^m$ by $T_A: R^n \to R^m$. Thus

$$T_A(\mathbf{x}) = A\mathbf{x} \tag{7}$$

It is understood in this equation that the vector x in $R^n$ is expressed as a column matrix.

Sometimes it is awkward to introduce a new letter to denote the standard matrix for a linear transformation $T: R^n \to R^m$. In
such cases we will denote the standard matrix for T by the symbol [T]. With this notation, equation 7 would take the form

$$T(\mathbf{x}) = [T]\mathbf{x} \tag{8}$$

Occasionally, the two notations for a standard matrix will be mixed, in which case we have the relationship

$$[T_A] = A \tag{9}$$

Remark Amidst all of this notation, it is important to keep in mind that we have established a correspondence between $m \times n$
matrices and linear transformations from $R^n$ to $R^m$: To each matrix A there corresponds a linear transformation $T_A$
(multiplication by A), and to each linear transformation $T: R^n \to R^m$, there corresponds an $m \times n$ matrix [T] (the standard
matrix for T).

Geometry of Linear Transformations 



Depending on whether n-tuples are regarded as points or vectors, the geometric effect of an operator $T: R^n \to R^n$ is to
transform each point (or vector) in $R^n$ into some new point (or vector) (Figure 4.2.1).

Figure 4.2.1 (a) T maps points to points. (b) T maps vectors to vectors.



EXAMPLE 3 Zero Transformation from $R^n$ to $R^m$

If 0 is the $m \times n$ zero matrix and $\mathbf{0}$ is the zero vector in $R^n$, then for every vector x in $R^n$,

$$T_0(\mathbf{x}) = 0\mathbf{x} = \mathbf{0}$$

so multiplication by zero maps every vector in $R^n$ into the zero vector in $R^m$. We call $T_0$ the zero transformation from $R^n$ to $R^m$.
Sometimes the zero transformation is denoted by 0. Although this is the same notation used for the zero matrix, the appropriate
interpretation will usually be clear from the context.

EXAMPLE 4 Identity Operator on $R^n$

If I is the $n \times n$ identity matrix, then for every vector x in $R^n$,

$$T_I(\mathbf{x}) = I\mathbf{x} = \mathbf{x}$$

so multiplication by I maps every vector in $R^n$ into itself. We call $T_I$ the identity operator on $R^n$. Sometimes the identity
operator is denoted by I. Although this is the same notation used for the identity matrix, the appropriate interpretation will
usually be clear from the context.



Among the most important linear operators on $R^2$ and $R^3$ are those that produce reflections, projections, and rotations. We shall
now discuss such operators.

Reflection Operators

Consider the operator $T: R^2 \to R^2$ that maps each vector into its symmetric image about the y-axis (Figure 4.2.2).

Figure 4.2.2

If we let w = T(x), then the equations relating the components of x and w are

$$\begin{aligned} w_1 &= -x = -x + 0y \\ w_2 &= y = 0x + y \end{aligned} \tag{10}$$

or, in matrix form,

$$\begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \tag{11}$$

Since the equations in 10 are linear, T is a linear operator, and from 11 the standard matrix for T is

$$[T] = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$$

In general, operators on $R^2$ and $R^3$ that map each vector into its symmetric image about some line or plane are called reflection
operators. Such operators are linear. Tables 2 and 3 list some of the common reflection operators.

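Table 2

Operator | Equations | Standard Matrix
Reflection about the y-axis | $w_1 = -x$, $w_2 = y$ | $\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$
Reflection about the x-axis | $w_1 = x$, $w_2 = -y$ | $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$
Reflection about the line y = x | $w_1 = y$, $w_2 = x$ | $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$

Table 3

Operator | Equations | Standard Matrix
Reflection about the xy-plane | $w_1 = x$, $w_2 = y$, $w_3 = -z$ | $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}$
Reflection about the xz-plane | $w_1 = x$, $w_2 = -y$, $w_3 = z$ | $\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
Reflection about the yz-plane | $w_1 = -x$, $w_2 = y$, $w_3 = z$ | $\begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$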
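The reflection operators of Tables 2 and 3 can be tried out by multiplying their standard matrices into sample vectors. The sketch below is illustrative only: the constant names and the sample vectors are hypothetical, and the matrices are the standard reflection matrices written out explicitly in the code.

```python
def mat_vec(A, x):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Standard matrices for reflections in R^2.
REFLECT_Y_AXIS = [[-1, 0], [0, 1]]
REFLECT_X_AXIS = [[1, 0], [0, -1]]
REFLECT_LINE_Y_EQ_X = [[0, 1], [1, 0]]

# Standard matrix for the reflection about the xy-plane in R^3.
REFLECT_XY_PLANE = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]

x = [3, 2]        # hypothetical sample vector in R^2
p = [1, 2, 3]     # hypothetical sample vector in R^3

print(mat_vec(REFLECT_Y_AXIS, x))        # x-component changes sign
print(mat_vec(REFLECT_X_AXIS, x))        # y-component changes sign
print(mat_vec(REFLECT_LINE_Y_EQ_X, x))   # components are interchanged
print(mat_vec(REFLECT_XY_PLANE, p))      # z-component changes sign

# Reflecting twice about the same line returns the original vector.
twice = mat_vec(REFLECT_Y_AXIS, mat_vec(REFLECT_Y_AXIS, x))
```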



Projection Operators

Consider the operator $T: R^2 \to R^2$ that maps each vector into its orthogonal projection on the x-axis (Figure 4.2.3). The
equations relating the components of x and w = T(x) are

$$\begin{aligned} w_1 &= x = x + 0y \\ w_2 &= 0 = 0x + 0y \end{aligned} \tag{12}$$

or, in matrix form,

$$\begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \tag{13}$$

Figure 4.2.3

The equations in 12 are linear, so T is a linear operator, and from 13 the standard matrix for T is

$$[T] = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$$

In general, a projection operator (more precisely, an orthogonal projection operator) on $R^2$ or $R^3$ is any operator that maps
each vector into its orthogonal projection on a line or plane through the origin. It can be shown that such operators are linear.
Some of the basic projection operators on $R^2$ and $R^3$ are listed in Tables 4 and 5.



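Table 4

Operator | Equations | Standard Matrix
Orthogonal projection on the x-axis | $w_1 = x$, $w_2 = 0$ | $\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$
Orthogonal projection on the y-axis | $w_1 = 0$, $w_2 = y$ | $\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$

Table 5

Operator | Equations | Standard Matrix
Orthogonal projection on the xy-plane | $w_1 = x$, $w_2 = y$, $w_3 = 0$ | $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$
Orthogonal projection on the xz-plane | $w_1 = x$, $w_2 = 0$, $w_3 = z$ | $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
Orthogonal projection on the yz-plane | $w_1 = 0$, $w_2 = y$, $w_3 = z$ | $\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$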
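The orthogonal projections on the coordinate axes and coordinate planes can be exercised the same way as any other matrix operator. The following sketch is an illustration under stated assumptions: the constant names and sample vectors are hypothetical, and the matrices are the standard projection matrices written out in the code.

```python
def mat_vec(A, x):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Orthogonal projections in R^2.
PROJECT_X_AXIS = [[1, 0], [0, 0]]
PROJECT_Y_AXIS = [[0, 0], [0, 1]]

# Orthogonal projections in R^3.
PROJECT_XY_PLANE = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
PROJECT_XZ_PLANE = [[1, 0, 0], [0, 0, 0], [0, 0, 1]]
PROJECT_YZ_PLANE = [[0, 0, 0], [0, 1, 0], [0, 0, 1]]

x = [3, 2]        # hypothetical sample vector in R^2
p = [1, 2, 3]     # hypothetical sample vector in R^3

print(mat_vec(PROJECT_X_AXIS, x), mat_vec(PROJECT_Y_AXIS, x))
print(mat_vec(PROJECT_XY_PLANE, p))
print(mat_vec(PROJECT_XZ_PLANE, p))
print(mat_vec(PROJECT_YZ_PLANE, p))
```

Each projection simply zeroes the component perpendicular to the line or plane being projected onto.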



Rotation Operators

An operator that rotates each vector in $R^2$ through a fixed angle $\theta$ is called a rotation operator on $R^2$. Table 6 gives the formula
for the rotation operators on $R^2$. To show how this is derived, consider the rotation operator that rotates each vector
counterclockwise through a fixed positive angle $\theta$. To find equations relating x and w = T(x), let $\phi$ be the angle from the
positive x-axis to x, and let r be the common length of x and w (Figure 4.2.4).

Figure 4.2.4

Table 6

Operator | Equations | Standard Matrix
Rotation through an angle $\theta$ | $w_1 = x\cos\theta - y\sin\theta$, $w_2 = x\sin\theta + y\cos\theta$ | $\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$

Then, from basic trigonometry,

$$x = r\cos\phi, \quad y = r\sin\phi \tag{14}$$

and

$$w_1 = r\cos(\theta + \phi), \quad w_2 = r\sin(\theta + \phi) \tag{15}$$

Using trigonometric identities on 15 yields

$$\begin{aligned} w_1 &= r\cos\theta\cos\phi - r\sin\theta\sin\phi \\ w_2 &= r\sin\theta\cos\phi + r\cos\theta\sin\phi \end{aligned}$$

and substituting 14 yields

$$\begin{aligned} w_1 &= x\cos\theta - y\sin\theta \\ w_2 &= x\sin\theta + y\cos\theta \end{aligned} \tag{16}$$

The equations in 16 are linear, so T is a linear operator; moreover, it follows from these equations that the standard matrix for T
is

$$[T] = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$



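EXAMPLE 5 Rotation

If each vector in $R^2$ is rotated through an angle of $\pi/6$ (= 30°), then the image w of a vector

$$\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}$$

is

$$\mathbf{w} = \begin{bmatrix} \cos\pi/6 & -\sin\pi/6 \\ \sin\pi/6 & \cos\pi/6 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \sqrt{3}/2 & -1/2 \\ 1/2 & \sqrt{3}/2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{3}}{2}x - \frac{1}{2}y \\ \frac{1}{2}x + \frac{\sqrt{3}}{2}y \end{bmatrix}$$

For example, the image of the vector

$$\mathbf{x} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad \text{is} \quad \mathbf{w} = \begin{bmatrix} \dfrac{\sqrt{3} - 1}{2} \\ \dfrac{1 + \sqrt{3}}{2} \end{bmatrix}$$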
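The rotation formula can be checked numerically. The sketch below builds the standard matrix for a counterclockwise rotation through an angle theta and applies it to the vector (1, 1) with theta = π/6; the function names are illustrative, and floating-point results are compared against the closed forms (√3 − 1)/2 and (1 + √3)/2 only up to rounding.

```python
import math

def rotation_matrix(theta):
    """Standard matrix for a counterclockwise rotation of R^2 through theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]

def mat_vec(A, x):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

w = mat_vec(rotation_matrix(math.pi / 6), [1, 1])
expected = [(math.sqrt(3) - 1) / 2, (1 + math.sqrt(3)) / 2]
print(w, expected)

# A rotation preserves length: |w| equals |x| = sqrt(2).
length_w = math.hypot(w[0], w[1])
```

The length check reflects the fact, used in the derivation of the rotation equations, that x and w have a common length r.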



A rotation of vectors in $R^3$ is usually described in relation to a ray emanating from the origin, called the axis of rotation. As a
vector revolves around the axis of rotation, it sweeps out some portion of a cone (Figure 4.2.5a). The angle of rotation, which is
measured in the base of the cone, is described as "clockwise" or "counterclockwise" in relation to a viewpoint that is along the
axis of rotation looking toward the origin. For example, in Figure 4.2.5a the vector w results from rotating the vector x
counterclockwise around the axis l through an angle $\theta$. As in $R^2$, angles are positive if they are generated by counterclockwise
rotations and negative if they are generated by clockwise rotations.

Figure 4.2.5 (a) Angle of rotation. (b) Right-hand rule.

The most common way of describing a general axis of rotation is to specify a nonzero vector u that runs along the axis of
rotation and has its initial point at the origin. The counterclockwise direction for a rotation about the axis can then be
determined by a "right-hand rule" (Figure 4.2.5b): If the thumb of the right hand points in the direction of u, then the cupped
fingers point in a counterclockwise direction.

A rotation operator on $R^3$ is a linear operator that rotates each vector in $R^3$ about some rotation axis through a fixed angle $\theta$. In
Table 7 we have described the rotation operators on $R^3$ whose axes of rotation are the positive coordinate axes. For each of
these rotations one of the components is unchanged by the rotation, and the relationships between the other components can be
derived by the same procedure used to derive 16. For example, in the rotation about the z-axis, the z-components of x and
w = T(x) are the same, and the x- and y-components are related as in 16. This yields the rotation equation shown in the last row
of Table 7.



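Table 7

Operator | Equations | Standard Matrix
Counterclockwise rotation about the positive x-axis through an angle $\theta$ | $w_1 = x$, $w_2 = y\cos\theta - z\sin\theta$, $w_3 = y\sin\theta + z\cos\theta$ | $\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}$
Counterclockwise rotation about the positive y-axis through an angle $\theta$ | $w_1 = x\cos\theta + z\sin\theta$, $w_2 = y$, $w_3 = -x\sin\theta + z\cos\theta$ | $\begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix}$
Counterclockwise rotation about the positive z-axis through an angle $\theta$ | $w_1 = x\cos\theta - y\sin\theta$, $w_2 = x\sin\theta + y\cos\theta$, $w_3 = z$ | $\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$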



Yaw, Pitch, and Roll

In aeronautics and astronautics, the orientation of an aircraft or space shuttle relative to an xyz-coordinate system is often
described in terms of angles called yaw, pitch, and roll. If, for example, an aircraft is flying along the y-axis and the xy-plane
defines the horizontal, then the aircraft's angle of rotation about the z-axis is called the yaw, its angle of rotation about the
x-axis is called the pitch, and its angle of rotation about the y-axis is called the roll. A combination of yaw, pitch, and roll can
be achieved by a single rotation about some axis through the origin. This is, in fact, how a space shuttle makes attitude
adjustments: it doesn't perform each rotation separately; it calculates one axis, and rotates about that axis to get the correct
orientation. Such rotation maneuvers are used to align an antenna, point the nose toward a celestial object, or position a
payload bay for docking.



For completeness, we note that the standard matrix for a counterclockwise rotation through an angle $\theta$ about an axis in $R^3$,
which is determined by an arbitrary unit vector u = (a, b, c) that has its initial point at the origin, is

$$\begin{bmatrix} a^2(1 - \cos\theta) + \cos\theta & ab(1 - \cos\theta) - c\sin\theta & ac(1 - \cos\theta) + b\sin\theta \\ ab(1 - \cos\theta) + c\sin\theta & b^2(1 - \cos\theta) + \cos\theta & bc(1 - \cos\theta) - a\sin\theta \\ ac(1 - \cos\theta) - b\sin\theta & bc(1 - \cos\theta) + a\sin\theta & c^2(1 - \cos\theta) + \cos\theta \end{bmatrix} \tag{17}$$

The derivation can be found in the book Principles of Interactive Computer Graphics, by W. M. Newman and R. F. Sproull
(New York: McGraw-Hill, 1979). The reader may find it instructive to derive the results in Table 7 as special cases of this more
general result.
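The suggestion to recover the coordinate-axis rotations as special cases of the general formula can be explored numerically. The sketch below codes the general matrix for a unit vector (a, b, c) and compares it, for one hypothetical test angle, with the standard matrices for rotations about the positive x-, y-, and z-axes. The function names are illustrative assumptions, and the comparison is a spot check rather than a derivation.

```python
import math

def rotation_about_axis(a, b, c, theta):
    """Standard matrix for a counterclockwise rotation through theta
    about the axis determined by the unit vector (a, b, c)."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k = 1 - cos_t
    return [
        [a * a * k + cos_t,     a * b * k - c * sin_t, a * c * k + b * sin_t],
        [a * b * k + c * sin_t, b * b * k + cos_t,     b * c * k - a * sin_t],
        [a * c * k - b * sin_t, b * c * k + a * sin_t, c * c * k + cos_t],
    ]

def close(A, B, tol=1e-12):
    """True if two matrices agree entry by entry up to rounding."""
    return all(math.isclose(x, y, abs_tol=tol)
               for row_a, row_b in zip(A, B) for x, y in zip(row_a, row_b))

theta = math.radians(40)   # hypothetical test angle
c, s = math.cos(theta), math.sin(theta)

about_x = [[1, 0, 0], [0, c, -s], [0, s, c]]
about_y = [[c, 0, s], [0, 1, 0], [-s, 0, c]]
about_z = [[c, -s, 0], [s, c, 0], [0, 0, 1]]

matches_x = close(rotation_about_axis(1, 0, 0, theta), about_x)
matches_y = close(rotation_about_axis(0, 1, 0, theta), about_y)
matches_z = close(rotation_about_axis(0, 0, 1, theta), about_z)
print(matches_x, matches_y, matches_z)
```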

Dilation and Contraction Operators

If k is a nonnegative scalar, then the operator T(x) = kx on $R^2$ or $R^3$ is called a contraction with factor k if $0 \le k \le 1$ and a
dilation with factor k if $k \ge 1$. The geometric effect of a contraction is to compress each vector by a factor of k (Figure 4.2.6a),
and the effect of a dilation is to stretch each vector by a factor of k (Figure 4.2.6b). A contraction compresses $R^2$ or $R^3$ uniformly
toward the origin from all directions, and a dilation stretches $R^2$ or $R^3$ uniformly away from the origin in all directions.

Figure 4.2.6 (a) $0 \le k \le 1$. (b) $k \ge 1$.

The most extreme contraction occurs when k = 0, in which case T(x) = kx reduces to the zero operator T(x) = 0, which
compresses every vector into a single point (the origin). If k = 1, then T(x) = kx reduces to the identity operator T(x) = x,
which leaves each vector unchanged; this can be regarded as either a contraction or a dilation. Tables 8 and 9 list the dilation
and contraction operators on $R^2$ and $R^3$.

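Table 8

Operator | Equations | Standard Matrix
Contraction with factor k on $R^2$ ($0 \le k \le 1$) | $w_1 = kx$, $w_2 = ky$ | $\begin{bmatrix} k & 0 \\ 0 & k \end{bmatrix}$
Dilation with factor k on $R^2$ ($k \ge 1$) | $w_1 = kx$, $w_2 = ky$ | $\begin{bmatrix} k & 0 \\ 0 & k \end{bmatrix}$

Table 9

Operator | Equations | Standard Matrix
Contraction with factor k on $R^3$ ($0 \le k \le 1$) | $w_1 = kx$, $w_2 = ky$, $w_3 = kz$ | $\begin{bmatrix} k & 0 & 0 \\ 0 & k & 0 \\ 0 & 0 & k \end{bmatrix}$
Dilation with factor k on $R^3$ ($k \ge 1$) | $w_1 = kx$, $w_2 = ky$, $w_3 = kz$ | $\begin{bmatrix} k & 0 & 0 \\ 0 & k & 0 \\ 0 & 0 & k \end{bmatrix}$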



Rotations in $R^3$

A familiar example of a rotation in $R^3$ is the rotation of the Earth about its axis through the North and South Poles. For
simplicity, we will assume that the Earth is a sphere. Since the Sun rises in the east and sets in the west, we know that the
Earth rotates from west to east. However, to an observer above the North Pole the rotation will appear counterclockwise, and
to an observer below the South Pole it will appear clockwise. Thus, when a rotation in $R^3$ is described as clockwise or
counterclockwise, a direction of view along the axis of rotation must also be stated.

There are some other facts about the Earth's rotation that are useful for understanding general rotations in $R^3$. For example, as
the Earth rotates about its axis, the North and South Poles remain fixed, as do all other points that lie on the axis of rotation.
Thus, the axis of rotation can be thought of as the line of fixed points in the Earth's rotation. Moreover, all points on the
Earth that are not on the axis of rotation move in circular paths that are centered on the axis and lie in planes that are
perpendicular to the axis. For example, the points in the Equatorial Plane move within the Equatorial Plane in circles about
the Earth's center.

Compositions of Linear Transformations 

If $T_A: R^n \to R^k$ and $T_B: R^k \to R^m$ are linear transformations, then for each x in $R^n$ one can first compute $T_A(\mathbf{x})$, which is a
vector in $R^k$, and then one can compute $T_B(T_A(\mathbf{x}))$, which is a vector in $R^m$. Thus, the application of $T_A$ followed by $T_B$
produces a transformation from $R^n$ to $R^m$. This transformation is called the composition of $T_B$ with $T_A$ and is denoted by
$T_B \circ T_A$ (read "$T_B$ circle $T_A$"). Thus

$$(T_B \circ T_A)(\mathbf{x}) = T_B(T_A(\mathbf{x})) \tag{18}$$

The composition $T_B \circ T_A$ is linear since

$$(T_B \circ T_A)(\mathbf{x}) = T_B(T_A(\mathbf{x})) = B(A\mathbf{x}) = (BA)\mathbf{x} \tag{19}$$

so $T_B \circ T_A$ is multiplication by BA, which is a linear transformation. Formula 19 also tells us that the standard matrix for
$T_B \circ T_A$ is BA. This is expressed by the formula

$$T_B \circ T_A = T_{BA} \tag{20}$$

Remark Formula 20 captures an important idea: Multiplying matrices is equivalent to composing the corresponding linear
transformations in the right-to-left order of the factors.

There is an alternative form of Formula 20: If $T_1: R^n \to R^k$ and $T_2: R^k \to R^m$ are linear transformations, then because the
standard matrix for the composition $T_2 \circ T_1$ is the product of the standard matrices of $T_2$ and $T_1$, we have

$$[T_2 \circ T_1] = [T_2][T_1] \tag{21}$$



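EXAMPLE 6 Composition of Two Rotations

Let $T_1: R^2 \to R^2$ and $T_2: R^2 \to R^2$ be the linear operators that rotate vectors through the angles $\theta_1$ and $\theta_2$, respectively. Thus
the operation

$$(T_2 \circ T_1)(\mathbf{x}) = T_2(T_1(\mathbf{x}))$$

first rotates x through the angle $\theta_1$, then rotates $T_1(\mathbf{x})$ through the angle $\theta_2$. It follows that the net effect of $T_2 \circ T_1$ is to rotate
each vector in $R^2$ through the angle $\theta_1 + \theta_2$ (Figure 4.2.7).

Figure 4.2.7

Thus the standard matrices for these linear operators are

$$[T_1] = \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{bmatrix}, \quad [T_2] = \begin{bmatrix} \cos\theta_2 & -\sin\theta_2 \\ \sin\theta_2 & \cos\theta_2 \end{bmatrix}$$

$$[T_2 \circ T_1] = \begin{bmatrix} \cos(\theta_1 + \theta_2) & -\sin(\theta_1 + \theta_2) \\ \sin(\theta_1 + \theta_2) & \cos(\theta_1 + \theta_2) \end{bmatrix}$$

These matrices should satisfy 21. With the help of some basic trigonometric identities, we can show that this is so as follows:

$$\begin{aligned} [T_2][T_1] &= \begin{bmatrix} \cos\theta_2 & -\sin\theta_2 \\ \sin\theta_2 & \cos\theta_2 \end{bmatrix} \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{bmatrix} \\ &= \begin{bmatrix} \cos\theta_2\cos\theta_1 - \sin\theta_2\sin\theta_1 & -(\cos\theta_2\sin\theta_1 + \sin\theta_2\cos\theta_1) \\ \sin\theta_2\cos\theta_1 + \cos\theta_2\sin\theta_1 & -\sin\theta_2\sin\theta_1 + \cos\theta_2\cos\theta_1 \end{bmatrix} \\ &= \begin{bmatrix} \cos(\theta_1 + \theta_2) & -\sin(\theta_1 + \theta_2) \\ \sin(\theta_1 + \theta_2) & \cos(\theta_1 + \theta_2) \end{bmatrix} \\ &= [T_2 \circ T_1] \end{aligned}$$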
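The identity verified above with trigonometric identities can also be spot-checked numerically. The sketch below multiplies the standard matrices of two rotations and compares the product with the standard matrix of the rotation through the sum of the angles. The two angles are hypothetical sample values and the helper names are illustrative.

```python
import math

def rotation_matrix(theta):
    """Standard matrix for a counterclockwise rotation of R^2 through theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]

def mat_mul(B, A):
    """Matrix product BA, the standard matrix of 'apply A first, then B'."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))]
            for i in range(len(B))]

theta1 = math.radians(30)   # hypothetical sample angles
theta2 = math.radians(45)

product = mat_mul(rotation_matrix(theta2), rotation_matrix(theta1))
combined = rotation_matrix(theta1 + theta2)

matches = all(math.isclose(product[i][j], combined[i][j], abs_tol=1e-12)
              for i in range(2) for j in range(2))
print(matches)
```

Because the angles simply add, the product taken in the opposite order gives the same matrix; rotations of $R^2$ about the origin are one case in which the order of composition does not matter.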



Remark In general, the order in which linear transformations are composed matters. This is to be expected, since the 
composition of two linear transformations corresponds to the multiplication of their standard matrices, and we know that the 
order in which matrices are multiplied makes a difference. 



EXAMPLE 7 Composition Is Not Commutative

Let $T_1: R^2 \to R^2$ be the reflection operator about the line y = x, and let $T_2: R^2 \to R^2$ be the orthogonal projection on the
y-axis. Figure 4.2.8 illustrates graphically that $T_1 \circ T_2$ and $T_2 \circ T_1$ have different effects on a vector x. This same conclusion
can be reached by showing that the standard matrices for $T_1$ and $T_2$ do not commute:

$$[T_1 \circ T_2] = [T_1][T_2] = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$$

$$[T_2 \circ T_1] = [T_2][T_1] = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}$$

so $[T_2 \circ T_1] \ne [T_1 \circ T_2]$.

Figure 4.2.8 (a) $T_2 \circ T_1$. (b) $T_1 \circ T_2$.
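The failure of commutativity in Example 7 is easy to reproduce. In the sketch below, `T1` is the standard matrix for the reflection about the line y = x and `T2` is the standard matrix for the orthogonal projection on the y-axis, as in the example; the helper names and the sample vector are hypothetical.

```python
def mat_mul(B, A):
    """Matrix product BA, the standard matrix of 'apply A first, then B'."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))]
            for i in range(len(B))]

def mat_vec(A, x):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

T1 = [[0, 1], [1, 0]]   # reflection about the line y = x
T2 = [[0, 0], [0, 1]]   # orthogonal projection on the y-axis

T1_after_T2 = mat_mul(T1, T2)   # project first, then reflect
T2_after_T1 = mat_mul(T2, T1)   # reflect first, then project
print(T1_after_T2, T2_after_T1)

x = [3, 2]   # hypothetical sample vector
print(mat_vec(T1_after_T2, x), mat_vec(T2_after_T1, x))
```

The two products differ, and so do the images of the sample vector: projecting and then reflecting lands on the x-axis, while reflecting and then projecting lands on the y-axis.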



EXAMPLE 8 Composition of Two Reflections

Let $T_1: R^2 \to R^2$ be the reflection about the y-axis, and let $T_2: R^2 \to R^2$ be the reflection about the x-axis. In this case
$T_1 \circ T_2$ and $T_2 \circ T_1$ are the same; both map each vector x = (x, y) into its negative -x = (-x, -y) (Figure 4.2.9):

$$(T_1 \circ T_2)(x, y) = T_1(x, -y) = (-x, -y)$$

$$(T_2 \circ T_1)(x, y) = T_2(-x, y) = (-x, -y)$$

Figure 4.2.9 (a) $T_1 \circ T_2$. (b) $T_2 \circ T_1$.

The equality of $T_1 \circ T_2$ and $T_2 \circ T_1$ can also be deduced by showing that the standard matrices for $T_1$ and $T_2$ commute:

$$[T_1 \circ T_2] = [T_1][T_2] = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$$

$$[T_2 \circ T_1] = [T_2][T_1] = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$$

The operator T(x) = -x on $R^2$ or $R^3$ is called the reflection about the origin. As the computations above show, the standard
matrix for this operator on $R^2$ is

$$[T] = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$$


Compositions of Three or More Linear Transformations

Compositions can be defined for three or more linear transformations. For example, consider the linear transformations

$$T_1: R^n \to R^k, \quad T_2: R^k \to R^l, \quad T_3: R^l \to R^m$$

We define the composition $(T_3 \circ T_2 \circ T_1): R^n \to R^m$ by

$$(T_3 \circ T_2 \circ T_1)(\mathbf{x}) = T_3(T_2(T_1(\mathbf{x})))$$

It can be shown that this composition is a linear transformation and that the standard matrix for $T_3 \circ T_2 \circ T_1$ is related to the
standard matrices for $T_1$, $T_2$, and $T_3$ by

$$[T_3 \circ T_2 \circ T_1] = [T_3][T_2][T_1] \tag{22}$$

which is a generalization of 21. If the standard matrices for $T_1$, $T_2$, and $T_3$ are denoted by A, B, and C, respectively, then we also
have the following generalization of 20:

$$T_C \circ T_B \circ T_A = T_{CBA} \tag{23}$$



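EXAMPLE 9 Composition of Three Transformations

Find the standard matrix for the linear operator $T: R^3 \to R^3$ that first rotates a vector counterclockwise about the z-axis
through an angle $\theta$, then reflects the resulting vector about the yz-plane, and then projects that vector orthogonally onto the
xy-plane.

Solution

The linear transformation T can be expressed as the composition

$$T = T_3 \circ T_2 \circ T_1$$

where $T_1$ is the rotation about the z-axis, $T_2$ is the reflection about the yz-plane, and $T_3$ is the orthogonal projection on the
xy-plane. From Tables 3, 5, and 7, the standard matrices for these linear transformations are

$$[T_1] = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad [T_2] = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad [T_3] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

Thus, from 22 the standard matrix for T is $[T] = [T_3][T_2][T_1]$; that is,

$$[T] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} -\cos\theta & \sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 \end{bmatrix}$$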
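The product in Example 9 can be checked for a particular angle. The sketch below multiplies the three standard matrices in the order $[T_3][T_2][T_1]$ and compares the result with the closed form whose first row is $(-\cos\theta, \sin\theta, 0)$ and second row is $(\sin\theta, \cos\theta, 0)$. The test angle and helper names are hypothetical.

```python
import math

def mat_mul(B, A):
    """Matrix product BA, the standard matrix of 'apply A first, then B'."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))]
            for i in range(len(B))]

theta = math.radians(25)   # hypothetical test angle
c, s = math.cos(theta), math.sin(theta)

T1 = [[c, -s, 0], [s, c, 0], [0, 0, 1]]    # rotation about the z-axis
T2 = [[-1, 0, 0], [0, 1, 0], [0, 0, 1]]    # reflection about the yz-plane
T3 = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]     # projection on the xy-plane

# The rightmost factor acts first: rotate, then reflect, then project.
T = mat_mul(T3, mat_mul(T2, T1))

expected = [[-c, s, 0], [s, c, 0], [0, 0, 0]]
matches = all(math.isclose(T[i][j], expected[i][j], abs_tol=1e-12)
              for i in range(3) for j in range(3))
print(matches)
```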





Exercise Set 4.2 






1. Find the domain and codomain of the transformation defined by the equations, and determine whether the transformation is
linear.

(a) $w_1 = 3x_1 - 2x_2 + 4x_3$
$w_2 = 5x_1 - 8x_2 + x_3$

(b) $w_1 = 2x_1x_2 - x_2$
$w_2 = x_1 + 3x_1x_2$
$w_3 = x_1 + x_2$

(c) $w_1 = 5x_1 - x_2 + x_3$
$w_2 = -x_1 + x_2 + 7x_3$
$w_3 = 2x_1 - 4x_2 - x_3$

^ ' >^i = * 1 — 3*2 I *3 — 2*4 

2 
w 2 — 3*1 — 4*2 — *3 I *4 

2. Find the standard matrix for the linear transformation defined by the equations.

(a) $w_1 = 2x_1 - 3x_2 + x_4$
    $w_2 = 3x_1 + 5x_2 - x_4$

(b) $w_1 = 7x_1 + 2x_2 - 8x_3$
    $w_2 = -x_2 + 5x_3$
    $w_3 = 4x_1 + 7x_2 - x_3$

(c) $w_1 = -x_1 + x_2$
    $w_2 = 3x_1 - 2x_2$
    $w_3 = 5x_1 - 7x_2$

(d) $w_1 = x_1$
    $w_2 = x_1 + x_2$
    $w_3 = x_1 + x_2 + x_3$
    $w_4 = x_1 + x_2 + x_3 + x_4$



3. Find the standard matrix for the linear operator $T: R^3 \to R^3$ given by

w\ = 3*i 4- 5*2 — *3 

W2=^l - *2 I * 3 

w^ = 3*i -h 2*2 — *3 

and then calculate T( — 1, 2, 4) by directly substituting in the equations and also by matrix multiplication. 

4. Find the standard matrix for the linear operator $T$ defined by the formula.

(a) $T(x_1, x_2) = (2x_1 - x_2,\ x_1 + x_2)$

(b) $T(x_1, x_2) = (x_1, x_2)$

(c) $T(x_1, x_2, x_3) = (x_1 + 2x_2 + x_3,\ x_1 + 5x_2,\ x_3)$

(d) $T(x_1, x_2, x_3) = (4x_1,\ 7x_2,\ -8x_3)$

5. Find the standard matrix for the linear transformation $T$ defined by the formula.

(a) $T(x_1, x_2) = (x_2,\ -x_1,\ x_1 + 3x_2,\ x_1 - x_2)$

(b) $T(x_1, x_2, x_3, x_4) = (7x_1 + 2x_2 - x_3 + x_4,\ x_2 + x_3,\ -x_1)$

(c) $T(x_1, x_2, x_3) = (0, 0, 0, 0, 0)$

(d) $T(x_1, x_2, x_3, x_4) = (x_4,\ x_1,\ x_3,\ x_2,\ x_1 - x_3)$



6. In each part, the standard matrix $[T]$ of a linear transformation $T$ is given. Use it to find $T(\mathbf{x})$. [Express the answers in matrix form.]



(a) 



[T] = 



1 2 
3 4 



x = 



3 
-2 



(b) 



[T] = 



1 2 
3 1 5 



; x = 



-1 
1 
3 



(c) 



[T] = 



2 1 


4" 




~*f 


3 5 


7 


; x = 


*2 


6 


-1 




*3 



(d) 



[T] = 



"-1 f 








*1 


2 4 


; x = 


X7 


7 8 


L J 



7. In each part, use the standard matrix for $T$ to find $T(\mathbf{x})$; then check the result by calculating $T(\mathbf{x})$ directly.

(a) $T(x_1, x_2) = (-x_1 + x_2,\ x_2)$; $\mathbf{x} = (-1, 4)$

(b) $T(x_1, x_2, x_3) = (2x_1 - x_2 + x_3,\ x_2 + x_3,\ 0)$; $\mathbf{x} = (2, 1, -3)$

8. Use matrix multiplication to find the reflection of (-1, 2) about

(a) the x-axis

(b) the y-axis

(c) the line y = x

9. Use matrix multiplication to find the reflection of (2, -5, 3) about

(a) the xy-plane

(b) the xz-plane

(c) the yz-plane



10. Use matrix multiplication to find the orthogonal projection of (2, -5) on

(a) the x-axis

(b) the y-axis

11. Use matrix multiplication to find the orthogonal projection of (-2, 1, 3) on

(a) the xy-plane

(b) the xz-plane

(c) the yz-plane

12. Use matrix multiplication to find the image of the vector (3, -4) when it is rotated through an angle of

(a) $\theta = 30°$

(b) $\theta = -60°$

(c) $\theta = 45°$

(d) $\theta = 90°$

13. Use matrix multiplication to find the image of the vector (-2, 1, 2) if it is rotated

(a) 30° about the x-axis

(b) 45° about the y-axis

(c) 90° about the z-axis

14. Find the standard matrix for the linear operator that rotates a vector in $R^3$ through an angle of -60° about

(a) the x-axis

(b) the y-axis

(c) the z-axis

15. Use matrix multiplication to find the image of the vector (-2, 1, 2) if it is rotated

(a) -30° about the x-axis

(b) -45° about the y-axis

(c) -90° about the z-axis

16. Find the standard matrix for the stated composition of linear operators on $R^2$.

(a) A rotation of 90°, followed by a reflection about the line y = x.

(b) An orthogonal projection on the j-axis, followed by a contraction with factor k = ^. 

(c) A reflection about the x-axis, followed by a dilation with factor $k = 3$.

17. Find the standard matrix for the stated composition of linear operators on $R^2$.

(a) A rotation of 60°, followed by an orthogonal projection on the x-axis, followed by a reflection about the line y = x.

(b) A dilation with factor $k = 2$, followed by a rotation of 45°, followed by a reflection about the y-axis.

(c) A rotation of 15°, followed by a rotation of 105°, followed by a rotation of 60°.

Find the standard matrix for the stated composition of linear operators on ^ 3 . 
18. 

(a) A reflection about the yz-plane, followed by an orthogonal projection on the ^-plane. 

(b) A rotation of 45° about the j-axis, followed by a dilation with factor k = J2. 

(c) An orthogonal projection on the ^-plane, followed by a reflection about the ^z-plane. 



19. Find the standard matrix for the stated composition of linear operators on $R^3$.

(a) A rotation of 30° about the x-axis, followed by a rotation of 30° about the z-axis, followed by a contraction with factor $k = \frac{1}{4}$.

(b) A reflection about the xy-plane, followed by a reflection about the xz-plane, followed by an orthogonal projection on the yz-plane.

(c) A rotation of 270° about the x-axis, followed by a rotation of 90° about the y-axis, followed by a rotation of 180° about the z-axis.



20. Determine whether $T_1 \circ T_2 = T_2 \circ T_1$.

(a) $T_1: R^2 \to R^2$ is the orthogonal projection on the x-axis, and $T_2: R^2 \to R^2$ is the orthogonal projection on the y-axis.

(b) $T_1: R^2 \to R^2$ is the rotation through an angle $\theta_1$, and $T_2: R^2 \to R^2$ is the rotation through an angle $\theta_2$.

(c) $T_1: R^2 \to R^2$ is the orthogonal projection on the x-axis, and $T_2: R^2 \to R^2$ is the rotation through an angle $\theta$.

21. Determine whether $T_1 \circ T_2 = T_2 \circ T_1$.

(a) $T_1: R^3 \to R^3$ is a dilation by a factor $k$, and $T_2: R^3 \to R^3$ is the rotation about the z-axis through an angle $\theta$.

(b) $T_1: R^3 \to R^3$ is the rotation about the x-axis through an angle $\theta_1$, and $T_2: R^3 \to R^3$ is the rotation about the z-axis through an angle $\theta_2$.



22. In $R^3$ the orthogonal projections on the x-axis, y-axis, and z-axis are defined by

$$T_1(x, y, z) = (x, 0, 0), \qquad T_2(x, y, z) = (0, y, 0), \qquad T_3(x, y, z) = (0, 0, z)$$

respectively.

(a) Show that the orthogonal projections on the coordinate axes are linear operators, and find their standard matrices.

(b) Show that if $T: R^3 \to R^3$ is an orthogonal projection on one of the coordinate axes, then for every vector $\mathbf{x}$ in $R^3$ the vectors $T(\mathbf{x})$ and $\mathbf{x} - T(\mathbf{x})$ are orthogonal vectors.

(c) Make a sketch showing $\mathbf{x}$ and $\mathbf{x} - T(\mathbf{x})$ in the case where $T$ is the orthogonal projection on the x-axis.



23. Derive the standard matrices for the rotations about the x-axis, y-axis, and z-axis in $R^3$ from Formula 17.

24. Use Formula 17 to find the standard matrix for a rotation of $\pi/2$ radians about the axis determined by the vector $\mathbf{v} = (1, 1, 1)$.

Note Formula 17 requires that the vector defining the axis of rotation have length 1.



25. Verify Formula 21 for the given linear transformations.

(a) $T_1(x_1, x_2) = (x_1 + x_2,\ x_1 - x_2)$ and $T_2(x_1, x_2) = (3x_1,\ 2x_1 + 4x_2)$

(b) $T_1(x_1, x_2) = (4x_1,\ -2x_1 + x_2,\ -x_1 - 3x_2)$ and $T_2(x_1, x_2, x_3) = (x_1 + 2x_2 - x_3,\ 4x_1 - x_3)$

(c) $T_1(x_1, x_2, x_3) = (-x_1 + x_2,\ -x_2 + x_3,\ -x_3 + x_1)$ and $T_2(x_1, x_2, x_3) = (-2x_1,\ 3x_3,\ -4x_2)$



26. It can be proved that if $A$ is a $2 \times 2$ matrix with $\det(A) = 1$ and such that the column vectors of $A$ are orthogonal and have length 1, then multiplication by $A$ is a rotation through some angle $\theta$. Verify that

$$A = \begin{bmatrix} -1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{bmatrix}$$

satisfies the stated conditions and find the angle of rotation.

27. The result stated in Exercise 26 is also true in $R^3$: It can be proved that if $A$ is a $3 \times 3$ matrix with $\det(A) = 1$ and such that the column vectors of $A$ are pairwise orthogonal and have length 1, then multiplication by $A$ is a rotation about some axis of rotation through some angle $\theta$. Use Formula 17 to show that if $A$ satisfies the stated conditions, then the angle of rotation satisfies the equation

$$\cos\theta = \frac{\operatorname{tr}(A) - 1}{2}$$



28. Let $A$ be a $3 \times 3$ matrix (other than the identity matrix) satisfying the conditions stated in Exercise 27. It can be shown that if $\mathbf{x}$ is any nonzero vector in $R^3$, then the vector $\mathbf{u} = A\mathbf{x} + A^T\mathbf{x} + [1 - \operatorname{tr}(A)]\mathbf{x}$ determines an axis of rotation when $\mathbf{u}$ is positioned with its initial point at the origin. [See "The Axis of Rotation: Analysis, Algebra, Geometry," by Dan Kalman, Mathematics Magazine, Vol. 62, No. 4, October 1989.]



(a) Show that multiplication by 



A = 



1 


4 


3 


9 


9 


9 


8 


4 


1 


9 


9 


9 


4 


7 


4 


9 


9 


9 



is a rotation. 



(b) Find a vector of length 1 that defines an axis for the rotation. 



(c) Use the result in Exercise 27 to find the angle of rotation about the axis obtained in part (b). 



Discussion 
Discovery 



29. 



In words, describe the geometric effect of multiplying a vector x by the matrix A. 



( a )^= 


2 









( b ^= 


2 
-2 



30. 



In words, describe the geometric effect of multiplying a vector x by the matrix A. 



(a) 



A = 



2 
3 




31. In words, describe the geometric effect of multiplying a vector $\mathbf{x}$ by the matrix

$$A = \begin{bmatrix} \cos^2\theta - \sin^2\theta & -2\sin\theta\cos\theta \\ 2\sin\theta\cos\theta & \cos^2\theta - \sin^2\theta \end{bmatrix}$$



32. If multiplication by $A$ rotates a vector $\mathbf{x}$ in the xy-plane through an angle $\theta$, what is the effect of multiplying $\mathbf{x}$ by $A^T$? Explain your reasoning.

33. Let $\mathbf{x}_0$ be a nonzero column vector in $R^2$, and suppose that $T: R^2 \to R^2$ is the transformation defined by $T(\mathbf{x}) = \mathbf{x}_0 + R_\theta\mathbf{x}$, where $R_\theta$ is the standard matrix of the rotation of $R^2$ about the origin through the angle $\theta$. Give a geometric description of this transformation. Is it a linear transformation? Explain.

34. A function of the form $f(x) = mx + b$ is commonly called a "linear function" because the graph of $y = mx + b$ is a line. Is $f$ a linear transformation on $R$?

35. Let $\mathbf{x} = \mathbf{x}_0 + t\mathbf{v}$ be a line in $R^n$, and let $T: R^n \to R^n$ be a linear operator on $R^n$. What kind of geometric object is the image of this line under the operator $T$? Explain your reasoning.
4.3

PROPERTIES OF LINEAR
TRANSFORMATIONS
FROM $R^n$ TO $R^m$



In this section we shall investigate the relationship between the invertibility of a
matrix and properties of the corresponding matrix transformation. We shall also
obtain a characterization of linear transformations from $R^n$ to $R^m$ that will form
the basis for more general linear transformations to be discussed in subsequent
sections, and we shall discuss some geometric properties of eigenvectors.



One-to-One Linear Transformations 

Linear transformations that map distinct vectors (or points) into distinct vectors (or points) are of special importance. One example of such a transformation is the linear operator $T: R^2 \to R^2$ that rotates each vector through an angle $\theta$. It is obvious geometrically that if $\mathbf{u}$ and $\mathbf{v}$ are distinct vectors in $R^2$, then so are the rotated vectors $T(\mathbf{u})$ and $T(\mathbf{v})$ (Figure 4.3.1).

Figure 4.3.1 Distinct vectors $\mathbf{u}$ and $\mathbf{v}$ are rotated into distinct vectors $T(\mathbf{u})$ and $T(\mathbf{v})$.



In contrast, if $T: R^3 \to R^3$ is the orthogonal projection of $R^3$ on the xy-plane, then distinct points on the same vertical line are mapped into the same point in the xy-plane (Figure 4.3.2).

Figure 4.3.2 The distinct points P and Q are mapped into the same point M.



DEFINITION 



A linear transformation $T: R^n \to R^m$ is said to be one-to-one if $T$ maps distinct vectors (points) in $R^n$ into distinct vectors (points) in $R^m$.



Remark It follows from this definition that for each vector $\mathbf{w}$ in the range of a one-to-one linear transformation $T$, there is exactly one vector $\mathbf{x}$ such that $T(\mathbf{x}) = \mathbf{w}$.



EXAMPLE 1 One-to-One Linear Transformations 



In the terminology of the preceding definition, the rotation operator of Figure 4.3.1 is one-to-one, but the orthogonal projection 
operator of Figure 4.3.2 is not. 

Let $A$ be an $n \times n$ matrix, and let $T_A: R^n \to R^n$ be multiplication by $A$. We shall now investigate relationships between the invertibility of $A$ and properties of $T_A$.

Recall from Theorem 2.3.6 (with $\mathbf{w}$ in place of $\mathbf{b}$) that the following are equivalent:

* $A$ is invertible.
* $A\mathbf{x} = \mathbf{w}$ is consistent for every $n \times 1$ matrix $\mathbf{w}$.
* $A\mathbf{x} = \mathbf{w}$ has exactly one solution for every $n \times 1$ matrix $\mathbf{w}$.

However, the last of these statements is actually stronger than necessary. One can show that the following are equivalent (Exercise 24):

* $A$ is invertible.
* $A\mathbf{x} = \mathbf{w}$ is consistent for every $n \times 1$ matrix $\mathbf{w}$.
* $A\mathbf{x} = \mathbf{w}$ has exactly one solution when the system is consistent.

Translating these into the corresponding statements about the linear operator $T_A$, we deduce that the following are equivalent:

* $A$ is invertible.
* For every vector $\mathbf{w}$ in $R^n$, there is some vector $\mathbf{x}$ in $R^n$ such that $T_A(\mathbf{x}) = \mathbf{w}$. Stated another way, the range of $T_A$ is all of $R^n$.
* For every vector $\mathbf{w}$ in the range of $T_A$, there is exactly one vector $\mathbf{x}$ in $R^n$ such that $T_A(\mathbf{x}) = \mathbf{w}$. Stated another way, $T_A$ is one-to-one.

In summary, we have established the following theorem about linear operators on R n . 
THEOREM 4.3.1 



Equivalent Statements 

If $A$ is an $n \times n$ matrix and $T_A: R^n \to R^n$ is multiplication by $A$, then the following statements are equivalent.

(a) $A$ is invertible.

(b) The range of $T_A$ is $R^n$.

(c) $T_A$ is one-to-one.



EXAMPLE 2 Applying Theorem 4.3.1 



In Example 1 we observed that the rotation operator $T: R^2 \to R^2$ illustrated in Figure 4.3.1 is one-to-one. It follows from Theorem 4.3.1 that the range of $T$ must be all of $R^2$ and that the standard matrix for $T$ must be invertible. To show that the range of $T$ is all of $R^2$, we must show that every vector $\mathbf{w}$ in $R^2$ is the image of some vector $\mathbf{x}$ under $T$. But this is clearly so, since the vector $\mathbf{x}$ obtained by rotating $\mathbf{w}$ through the angle $-\theta$ maps into $\mathbf{w}$ when rotated through the angle $\theta$. Moreover, from Table 6 of Section 4.2, the standard matrix for $T$ is

$$[T] = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

which is invertible, since

$$\det[T] = \begin{vmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{vmatrix} = \cos^2\theta + \sin^2\theta = 1 \neq 0$$



EXAMPLE 3 Applying Theorem 4.3.1 



In Example 1 we observed that the projection operator $T: R^3 \to R^3$ illustrated in Figure 4.3.2 is not one-to-one. It follows from Theorem 4.3.1 that the range of $T$ is not all of $R^3$ and that the standard matrix for $T$ is not invertible. To show directly that the range of $T$ is not all of $R^3$, we must find a vector $\mathbf{w}$ in $R^3$ that is not the image of any vector $\mathbf{x}$ under $T$. But any vector $\mathbf{w}$ outside of the xy-plane has this property, since all images under $T$ lie in the xy-plane. Moreover, from Table 5 of Section 4.2, the standard matrix for $T$ is

$$[T] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

which is not invertible, since $\det[T] = 0$.
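Examples 2 and 3 can be mirrored with a short numerical check. The sketch below is only an illustration in plain Python; the names `det`, `is_one_to_one`, `rotation`, and `PROJECTION_XY` are hypothetical, and the tolerance is an assumption made for floating-point comparison. It uses the equivalence in Theorem 4.3.1 between invertibility (tested here through a nonzero determinant) and being one-to-one.

```python
import math

def det(M):
    """Determinant of a 2x2 or 3x3 matrix stored as a list of rows."""
    if len(M) == 2:
        return M[0][0] * M[1][1] - M[0][1] * M[1][0]
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def is_one_to_one(M, tol=1e-12):
    """T_A is one-to-one exactly when A is invertible, i.e. det(A) != 0."""
    return abs(det(M)) > tol

def rotation(theta):
    """Standard matrix of the rotation of R^2 through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# Standard matrix of the orthogonal projection of R^3 on the xy-plane.
PROJECTION_XY = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]

def apply(M, x):
    """Compute the matrix-vector product Mx."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

if __name__ == "__main__":
    print(is_one_to_one(rotation(1.0)))      # rotation: determinant is 1
    print(is_one_to_one(PROJECTION_XY))      # projection: determinant is 0
    # Two distinct points on one vertical line share the same image.
    print(apply(PROJECTION_XY, [1, 2, 3]), apply(PROJECTION_XY, [1, 2, -5]))
```

The last line illustrates the failure of one-to-one behavior directly: the two inputs differ only in their z-coordinate, and the projection discards that coordinate.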



Inverse of a One-to-One Linear Operator 



If $T_A: R^n \to R^n$ is a one-to-one linear operator, then from Theorem 4.3.1 the matrix $A$ is invertible. Thus, $T_{A^{-1}}: R^n \to R^n$ is itself a linear operator; it is called the inverse of $T_A$. The linear operators $T_A$ and $T_{A^{-1}}$ cancel the effect of one another in the sense that for all $\mathbf{x}$ in $R^n$,

$$T_A(T_{A^{-1}}(\mathbf{x})) = AA^{-1}\mathbf{x} = I\mathbf{x} = \mathbf{x}$$

$$T_{A^{-1}}(T_A(\mathbf{x})) = A^{-1}A\mathbf{x} = I\mathbf{x} = \mathbf{x}$$

or, equivalently,

$$T_A \circ T_{A^{-1}} = T_{AA^{-1}} = T_I, \qquad T_{A^{-1}} \circ T_A = T_{A^{-1}A} = T_I$$

From a more geometric viewpoint, if $\mathbf{w}$ is the image of $\mathbf{x}$ under $T_A$, then $T_{A^{-1}}$ maps $\mathbf{w}$ back into $\mathbf{x}$, since

$$T_{A^{-1}}(\mathbf{w}) = T_{A^{-1}}(T_A(\mathbf{x})) = \mathbf{x}$$

(Figure 4.3.3).

Figure 4.3.3

Before turning to an example, it will be helpful to touch on a notational matter. When a one-to-one linear operator on $R^n$ is written as $T: R^n \to R^n$ (rather than $T_A: R^n \to R^n$), then the inverse of the operator $T$ is denoted by $T^{-1}$ (rather than $T_{A^{-1}}$). Since the standard matrix for $T^{-1}$ is the inverse of the standard matrix for $T$, we have

$$[T^{-1}] = [T]^{-1} \tag{1}$$



EXAMPLE 4 Standard Matrix for $T^{-1}$

Let $T: R^2 \to R^2$ be the operator that rotates each vector in $R^2$ through the angle $\theta$, so from Table 6 of Section 4.2,

$$[T] = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \tag{2}$$

It is evident geometrically that to undo the effect of $T$, one must rotate each vector in $R^2$ through the angle $-\theta$. But this is exactly what the operator $T^{-1}$ does, since the standard matrix for $T^{-1}$ is

$$[T^{-1}] = [T]^{-1} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} \cos(-\theta) & -\sin(-\theta) \\ \sin(-\theta) & \cos(-\theta) \end{bmatrix}$$

(verify), which is identical to (2) except that $\theta$ is replaced by $-\theta$.



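EXAMPLE 5 Finding $T^{-1}$

Show that the linear operator $T: R^2 \to R^2$ defined by the equations

$$\begin{aligned} w_1 &= 2x_1 + x_2 \\ w_2 &= 3x_1 + 4x_2 \end{aligned}$$

is one-to-one, and find $T^{-1}(w_1, w_2)$.

Solution

The matrix form of these equations is

$$\begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

so the standard matrix for $T$ is

$$[T] = \begin{bmatrix} 2 & 1 \\ 3 & 4 \end{bmatrix}$$

This matrix is invertible (so $T$ is one-to-one) and the standard matrix for $T^{-1}$ is

$$[T^{-1}] = [T]^{-1} = \begin{bmatrix} \frac{4}{5} & -\frac{1}{5} \\ -\frac{3}{5} & \frac{2}{5} \end{bmatrix}$$

Thus

$$[T^{-1}] \begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} \frac{4}{5} & -\frac{1}{5} \\ -\frac{3}{5} & \frac{2}{5} \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} \frac{4}{5}w_1 - \frac{1}{5}w_2 \\ -\frac{3}{5}w_1 + \frac{2}{5}w_2 \end{bmatrix}$$

from which we conclude that

$$T^{-1}(w_1, w_2) = \left(\tfrac{4}{5}w_1 - \tfrac{1}{5}w_2,\ -\tfrac{3}{5}w_1 + \tfrac{2}{5}w_2\right)$$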
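The inverse found in Example 5 can be reproduced with exact rational arithmetic. The following sketch is an illustration only; `inverse_2x2` and `apply` are hypothetical helper names, and the closed-form $2 \times 2$ inverse is used in place of row reduction.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of a 2x2 matrix with integer entries, using the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible, so T is not one-to-one")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def apply(M, x):
    """Compute the matrix-vector product Mx."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

if __name__ == "__main__":
    T = [[2, 1], [3, 4]]            # standard matrix from Example 5
    T_inv = inverse_2x2(T)
    print(T_inv)
    w = apply(T, [7, -2])           # image of a sample vector
    print(apply(T_inv, w))          # should recover the sample vector
```

Applying $T$ and then $T^{-1}$ to a sample vector recovers that vector, which is the cancellation property stated just before Example 4.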



Linearity Properties 

In the preceding section we defined a transformation $T: R^n \to R^m$ to be linear if the equations relating $\mathbf{x}$ and $\mathbf{w} = T(\mathbf{x})$ are linear equations. The following theorem provides an alternative characterization of linearity. This theorem is fundamental and will be the basis for extending the concept of a linear transformation to more general settings later in this text.



THEOREM 4.3.2 



Properties of Linear Transformations 

A transformation $T: R^n \to R^m$ is linear if and only if the following relationships hold for all vectors $\mathbf{u}$ and $\mathbf{v}$ in $R^n$ and for every scalar $c$.

(a) $T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})$

(b) $T(c\mathbf{u}) = cT(\mathbf{u})$
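Properties (a) and (b) lend themselves to a quick spot check. The sketch below is an illustration in plain Python with hypothetical names (`passes_linearity_checks`, `t_linear`, `t_shift`) and arbitrarily chosen sample vectors. A failed check proves that a map is not linear; passing on finitely many samples is only evidence, not a proof.

```python
def add(u, v):
    """Componentwise sum of two vectors stored as tuples."""
    return tuple(a + b for a, b in zip(u, v))

def scale(c, u):
    """Scalar multiple of a vector stored as a tuple."""
    return tuple(c * a for a in u)

def passes_linearity_checks(T, vectors, scalars):
    """Test properties (a) and (b) of Theorem 4.3.2 on the given samples."""
    for u in vectors:
        for v in vectors:
            if T(add(u, v)) != add(T(u), T(v)):      # property (a)
                return False
        for c in scalars:
            if T(scale(c, u)) != scale(c, T(u)):     # property (b)
                return False
    return True

def t_linear(p):
    """T(x, y) = (3x - y, 2y): given by linear equations."""
    x, y = p
    return (3 * x - y, 2 * y)

def t_shift(p):
    """T(x, y) = (x + 1, y): the constant term breaks both properties."""
    x, y = p
    return (x + 1, y)

if __name__ == "__main__":
    samples = [(1, 0), (0, 1), (2, -3)]
    scalars = [0, 2, -1]
    print(passes_linearity_checks(t_linear, samples, scalars))
    print(passes_linearity_checks(t_shift, samples, scalars))
```

Integer inputs are used so that the equality tests are exact; with floating-point data a tolerance would be needed.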



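Proof Assume first that $T$ is a linear transformation, and let $A$ be the standard matrix for $T$. It follows from the basic arithmetic properties of matrices that

$$T(\mathbf{u} + \mathbf{v}) = A(\mathbf{u} + \mathbf{v}) = A\mathbf{u} + A\mathbf{v} = T(\mathbf{u}) + T(\mathbf{v})$$

and

$$T(c\mathbf{u}) = A(c\mathbf{u}) = c(A\mathbf{u}) = cT(\mathbf{u})$$

Conversely, assume that properties (a) and (b) hold for the transformation $T$. We can prove that $T$ is linear by finding a matrix $A$ with the property that

$$T(\mathbf{x}) = A\mathbf{x} \tag{3}$$

for all vectors $\mathbf{x}$ in $R^n$. This will show that $T$ is multiplication by $A$ and therefore linear. But before we can produce this matrix, we need to observe that property (a) can be extended to three or more terms; for example, if $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are any vectors in $R^n$, then by first grouping $\mathbf{v}$ and $\mathbf{w}$ and applying property (a), we obtain

$$T(\mathbf{u} + \mathbf{v} + \mathbf{w}) = T(\mathbf{u} + (\mathbf{v} + \mathbf{w})) = T(\mathbf{u}) + T(\mathbf{v} + \mathbf{w}) = T(\mathbf{u}) + T(\mathbf{v}) + T(\mathbf{w})$$

More generally, for any vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ in $R^n$, we have

$$T(\mathbf{v}_1 + \mathbf{v}_2 + \cdots + \mathbf{v}_k) = T(\mathbf{v}_1) + T(\mathbf{v}_2) + \cdots + T(\mathbf{v}_k)$$

Now, to find the matrix $A$, let $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$ be the vectors

$$\mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad \mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad \mathbf{e}_n = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \tag{4}$$

and let $A$ be the matrix whose successive column vectors are $T(\mathbf{e}_1), T(\mathbf{e}_2), \ldots, T(\mathbf{e}_n)$; that is,

$$A = [\,T(\mathbf{e}_1) \mid T(\mathbf{e}_2) \mid \cdots \mid T(\mathbf{e}_n)\,] \tag{5}$$

If

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$

is any vector in $R^n$, then as discussed in Section 1.3, the product $A\mathbf{x}$ is a linear combination of the column vectors of $A$ with coefficients from $\mathbf{x}$, so

$$\begin{aligned} A\mathbf{x} &= x_1 T(\mathbf{e}_1) + x_2 T(\mathbf{e}_2) + \cdots + x_n T(\mathbf{e}_n) \\ &= T(x_1\mathbf{e}_1) + T(x_2\mathbf{e}_2) + \cdots + T(x_n\mathbf{e}_n) && \text{Property (b)} \\ &= T(x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \cdots + x_n\mathbf{e}_n) && \text{Property (a) for } n \text{ terms} \\ &= T(\mathbf{x}) \end{aligned}$$

which completes the proof.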



Expression (5) is important in its own right, since it provides an explicit formula for the standard matrix of a linear transformation $T: R^n \to R^m$ in terms of the images of the vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$ under $T$. For reasons that will be discussed later, the vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$ in (4) are called the standard basis vectors for $R^n$. In $R^2$ and $R^3$ these are the vectors of length 1 along the coordinate axes (Figure 4.3.4).

Figure 4.3.4 (a) Standard basis for $R^2$. (b) Standard basis for $R^3$.

Because of its importance, we shall state 5 as a theorem for future reference. 



THEOREM 4.3.3 





If $T: R^n \to R^m$ is a linear transformation, and $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$ are the standard basis vectors for $R^n$, then the standard matrix for $T$ is

$$[T] = [\,T(\mathbf{e}_1) \mid T(\mathbf{e}_2) \mid \cdots \mid T(\mathbf{e}_n)\,] \tag{6}$$
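Formula (6) translates directly into a procedure: evaluate $T$ on each standard basis vector and use the results as columns. The sketch below is a minimal illustration in plain Python; `standard_matrix` is a hypothetical name, and the function assumes that the map passed in really is linear, since it only ever looks at the $n$ basis images.

```python
def standard_matrix(T, n):
    """Build [T] column by column from T(e_1), ..., T(e_n), as in Formula (6)."""
    # e_j has a 1 in position j and 0 elsewhere.
    columns = [T(tuple(1 if i == j else 0 for i in range(n))) for j in range(n)]
    m = len(columns[0])
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def project_xy(v):
    """Orthogonal projection of R^3 on the xy-plane."""
    x, y, z = v
    return (x, y, 0)

def example5_operator(v):
    """The operator w1 = 2*x1 + x2, w2 = 3*x1 + 4*x2 from Example 5."""
    x1, x2 = v
    return (2 * x1 + x2, 3 * x1 + 4 * x2)

if __name__ == "__main__":
    print(standard_matrix(project_xy, 3))
    print(standard_matrix(example5_operator, 2))
```

Both outputs can be compared with matrices that appear in the text: the projection matrix from Table 5 of Section 4.2 and the matrix $[T]$ of Example 5.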



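Formula (6) is a powerful tool for finding standard matrices and analyzing the geometric effect of a linear transformation. For example, suppose that $T: R^3 \to R^3$ is the orthogonal projection on the xy-plane. Referring to Figure 4.3.4, it is evident geometrically that

$$T(\mathbf{e}_1) = \mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad T(\mathbf{e}_2) = \mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad T(\mathbf{e}_3) = \mathbf{0} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

so by (6),

$$[T] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

which agrees with the result in Table 5 of Section 4.2.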



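Using (6) another way, suppose that $T_A: R^3 \to R^2$ is multiplication by

$$A = \begin{bmatrix} -1 & 2 & 1 \\ 3 & 0 & 6 \end{bmatrix}$$

The images of the standard basis vectors can be read directly from the columns of the matrix $A$:

$$T_A\left(\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\right) = \begin{bmatrix} -1 \\ 3 \end{bmatrix}, \quad T_A\left(\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\right) = \begin{bmatrix} 2 \\ 0 \end{bmatrix}, \quad T_A\left(\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} 1 \\ 6 \end{bmatrix}$$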





EXAMPLE 6 Standard Matrix for a Projection Operator 



Let $l$ be the line in the xy-plane that passes through the origin and makes an angle $\theta$ with the positive x-axis, where $0 \le \theta < \pi$. As illustrated in Figure 4.3.5a, let $T: R^2 \to R^2$ be a linear operator that maps each vector into its orthogonal projection on $l$.

Figure 4.3.5

(a) Find the standard matrix for $T$.

(b) Find the orthogonal projection of the vector $\mathbf{x} = (1, 5)$ onto the line through the origin that makes an angle of $\theta = \pi/6$ with the positive x-axis.



Solution (a) 



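From (6),

$$[T] = [\,T(\mathbf{e}_1) \mid T(\mathbf{e}_2)\,]$$

where $\mathbf{e}_1$ and $\mathbf{e}_2$ are the standard basis vectors for $R^2$. We consider the case where $0 \le \theta \le \pi/2$; the case where $\pi/2 < \theta < \pi$ is similar. Referring to Figure 4.3.5b, we have $\|T(\mathbf{e}_1)\| = \cos\theta$, so

$$T(\mathbf{e}_1) = \begin{bmatrix} \|T(\mathbf{e}_1)\|\cos\theta \\ \|T(\mathbf{e}_1)\|\sin\theta \end{bmatrix} = \begin{bmatrix} \cos^2\theta \\ \sin\theta\cos\theta \end{bmatrix}$$

and referring to Figure 4.3.5c, we have $\|T(\mathbf{e}_2)\| = \sin\theta$, so

$$T(\mathbf{e}_2) = \begin{bmatrix} \|T(\mathbf{e}_2)\|\cos\theta \\ \|T(\mathbf{e}_2)\|\sin\theta \end{bmatrix} = \begin{bmatrix} \sin\theta\cos\theta \\ \sin^2\theta \end{bmatrix}$$

Thus the standard matrix for $T$ is

$$[T] = \begin{bmatrix} \cos^2\theta & \sin\theta\cos\theta \\ \sin\theta\cos\theta & \sin^2\theta \end{bmatrix}$$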



Solution (b) 

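Since $\sin \pi/6 = 1/2$ and $\cos \pi/6 = \sqrt{3}/2$, it follows from part (a) that the standard matrix for this projection operator is

$$[T] = \begin{bmatrix} 3/4 & \sqrt{3}/4 \\ \sqrt{3}/4 & 1/4 \end{bmatrix}$$

Thus

$$T\left(\begin{bmatrix} 1 \\ 5 \end{bmatrix}\right) = \begin{bmatrix} 3/4 & \sqrt{3}/4 \\ \sqrt{3}/4 & 1/4 \end{bmatrix} \begin{bmatrix} 1 \\ 5 \end{bmatrix} = \begin{bmatrix} \dfrac{3 + 5\sqrt{3}}{4} \\[2mm] \dfrac{\sqrt{3} + 5}{4} \end{bmatrix}$$

or, in point notation,

$$T(1, 5) = \left(\frac{3 + 5\sqrt{3}}{4},\ \frac{\sqrt{3} + 5}{4}\right)$$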
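The matrix derived in Example 6 is easy to evaluate for any angle. The sketch below is an illustration in plain Python with hypothetical names (`projection_matrix`, `project`); it assumes the angle is given in radians and reproduces part (b) numerically.

```python
import math

def projection_matrix(theta):
    """Standard matrix of the orthogonal projection onto the line at angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, s * c],
            [s * c, s * s]]

def project(x, theta):
    """Orthogonal projection of the vector x = (x1, x2) onto that line."""
    P = projection_matrix(theta)
    return (P[0][0] * x[0] + P[0][1] * x[1],
            P[1][0] * x[0] + P[1][1] * x[1])

if __name__ == "__main__":
    image = project((1, 5), math.pi / 6)
    exact = ((3 + 5 * math.sqrt(3)) / 4, (math.sqrt(3) + 5) / 4)
    print(image)
    print(exact)   # closed form from Solution (b), for comparison
```

Projecting the result a second time leaves it unchanged (up to rounding), which reflects the geometric fact that a vector already on the line is its own projection.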



Geometric Interpretation of Eigenvectors 

Recall from Section 2.3 that if $A$ is an $n \times n$ matrix, then $\lambda$ is called an eigenvalue of $A$ if there is a nonzero vector $\mathbf{x}$ such that

$$A\mathbf{x} = \lambda\mathbf{x} \quad \text{or, equivalently,} \quad (\lambda I - A)\mathbf{x} = \mathbf{0}$$

The nonzero vectors $\mathbf{x}$ satisfying this equation are called the eigenvectors of $A$ corresponding to $\lambda$.

Eigenvalues and eigenvectors can also be defined for linear operators on $R^n$; the definitions parallel those for matrices.



DEFINITION 



If $T: R^n \to R^n$ is a linear operator, then a scalar $\lambda$ is called an eigenvalue of $T$ if there is a nonzero $\mathbf{x}$ in $R^n$ such that

$$T(\mathbf{x}) = \lambda\mathbf{x} \tag{7}$$

Those nonzero vectors $\mathbf{x}$ that satisfy this equation are called the eigenvectors of $T$ corresponding to $\lambda$.



Observe that if $A$ is the standard matrix for $T$, then (7) can be written as

$$A\mathbf{x} = \lambda\mathbf{x}$$

from which it follows that

* The eigenvalues of $T$ are precisely the eigenvalues of its standard matrix $A$.
* $\mathbf{x}$ is an eigenvector of $T$ corresponding to $\lambda$ if and only if $\mathbf{x}$ is an eigenvector of $A$ corresponding to $\lambda$.



If $\lambda$ is an eigenvalue of $A$ and $\mathbf{x}$ is a corresponding eigenvector, then $A\mathbf{x} = \lambda\mathbf{x}$, so multiplication by $A$ maps $\mathbf{x}$ into a scalar multiple of itself. In $R^2$ and $R^3$, this means that multiplication by $A$ maps each eigenvector $\mathbf{x}$ into a vector that lies on the same line as $\mathbf{x}$ (Figure 4.3.6).

Figure 4.3.6 (a) $\lambda > 0$. (b) $\lambda < 0$.

Recall from Section 4.2 that if $\lambda > 0$, then the linear operator $A\mathbf{x} = \lambda\mathbf{x}$ compresses $\mathbf{x}$ by a factor of $\lambda$ if $0 < \lambda < 1$ or stretches $\mathbf{x}$ by a factor of $\lambda$ if $\lambda > 1$. If $\lambda < 0$, then $A\mathbf{x} = \lambda\mathbf{x}$ reverses the direction of $\mathbf{x}$ and compresses the reversed vector by a factor of $|\lambda|$ if $0 < |\lambda| < 1$ or stretches the reversed vector by a factor of $|\lambda|$ if $|\lambda| > 1$ (Figure 4.3.7).

Figure 4.3.7





EXAMPLE 7 Eigenvalues of a Linear Operator 



Let $T: R^2 \to R^2$ be the linear operator that rotates each vector through an angle $\theta$. It is evident geometrically that unless $\theta$ is a multiple of $\pi$, $T$ does not map any nonzero vector $\mathbf{x}$ onto the same line as $\mathbf{x}$; consequently, $T$ has no real eigenvalues. But if $\theta$ is a multiple of $\pi$, then every nonzero vector $\mathbf{x}$ is mapped onto the same line as $\mathbf{x}$, so every nonzero vector is an eigenvector of $T$. Let us verify these geometric observations algebraically. The standard matrix for $T$ is

$$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

As discussed in Section 2.3, the eigenvalues of this matrix are the solutions of the characteristic equation

$$\det(\lambda I - A) = \begin{vmatrix} \lambda - \cos\theta & \sin\theta \\ -\sin\theta & \lambda - \cos\theta \end{vmatrix} = 0$$

that is,

$$(\lambda - \cos\theta)^2 + \sin^2\theta = 0 \tag{8}$$

But if $\theta$ is not a multiple of $\pi$, then $\sin^2\theta > 0$, so this equation has no real solution for $\lambda$, and consequently $A$ has no real eigenvalues.* If $\theta$ is a multiple of $\pi$, then $\sin\theta = 0$ and either $\cos\theta = 1$ or $\cos\theta = -1$, depending on the particular multiple of $\pi$. In the case where $\sin\theta = 0$ and $\cos\theta = 1$, the characteristic equation (8) becomes $(\lambda - 1)^2 = 0$, so $\lambda = 1$ is the only eigenvalue of $A$. In this case the matrix $A$ is

$$A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I$$

Thus, for all $\mathbf{x}$ in $R^2$,

$$T(\mathbf{x}) = A\mathbf{x} = I\mathbf{x} = \mathbf{x}$$

so $T$ maps every vector to itself, and hence to the same line. In the case where $\sin\theta = 0$ and $\cos\theta = -1$, the characteristic equation (8) becomes $(\lambda + 1)^2 = 0$, so $\lambda = -1$ is the only eigenvalue of $A$. In this case the matrix $A$ is

$$A = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} = -I$$

Thus, for all $\mathbf{x}$ in $R^2$,

$$T(\mathbf{x}) = A\mathbf{x} = -I\mathbf{x} = -\mathbf{x}$$

so $T$ maps every vector to its negative, and hence to the same line as $\mathbf{x}$.
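The case analysis in Example 7 can be summarized in a few lines of code. The sketch below is an illustration only; the function name `real_eigenvalues_of_rotation` and the tolerance `tol` are assumptions introduced to cope with floating-point values of $\sin\theta$ that are tiny but not exactly zero. It applies equation (8): a real solution exists only when $\sin^2\theta = 0$, and then $\lambda = \cos\theta$.

```python
import math

def real_eigenvalues_of_rotation(theta, tol=1e-12):
    """Real solutions of (lambda - cos(theta))^2 + sin(theta)^2 = 0."""
    if math.sin(theta) ** 2 > tol:
        # The left side of (8) is strictly positive for every real lambda.
        return []
    # theta is (numerically) a multiple of pi, so lambda = cos(theta) = 1 or -1.
    return [float(round(math.cos(theta)))]

if __name__ == "__main__":
    for angle in (0.0, math.pi / 3, math.pi, 2 * math.pi):
        print(angle, real_eigenvalues_of_rotation(angle))
```

Rounding $\cos\theta$ is safe in the second branch because the branch is reached only when $\theta$ is within rounding error of a multiple of $\pi$, where the cosine is $\pm 1$.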



EXAMPLE 8 Eigenvalues of a Linear Operator 



Let $T: R^3 \to R^3$ be the orthogonal projection on the xy-plane. Vectors in the xy-plane are mapped into themselves under $T$, so each nonzero vector in the xy-plane is an eigenvector corresponding to the eigenvalue $\lambda = 1$. Every vector $\mathbf{x}$ along the z-axis is mapped into $\mathbf{0}$ under $T$, which is on the same line as $\mathbf{x}$, so every nonzero vector on the z-axis is an eigenvector corresponding to the eigenvalue $\lambda = 0$. Vectors that are not in the xy-plane or along the z-axis are not mapped into scalar multiples of themselves, so there are no other eigenvectors or eigenvalues.

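To verify these geometric observations algebraically, recall from Table 5 of Section 4.2 that the standard matrix for $T$ is

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

The characteristic equation of $A$ is

$$\det(\lambda I - A) = \begin{vmatrix} \lambda - 1 & 0 & 0 \\ 0 & \lambda - 1 & 0 \\ 0 & 0 & \lambda \end{vmatrix} = 0 \quad \text{or} \quad (\lambda - 1)^2\lambda = 0$$

which has the solutions $\lambda = 0$ and $\lambda = 1$ anticipated above.

As discussed in Section 2.3, the eigenvectors of the matrix $A$ corresponding to an eigenvalue $\lambda$ are the nonzero solutions of

$$\begin{bmatrix} \lambda - 1 & 0 & 0 \\ 0 & \lambda - 1 & 0 \\ 0 & 0 & \lambda \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \tag{9}$$

If $\lambda = 0$, this system is

$$\begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

which has the solutions $x_1 = 0$, $x_2 = 0$, $x_3 = t$ (verify), or, in matrix form,

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ t \end{bmatrix}$$

As anticipated, these are the vectors along the z-axis. If $\lambda = 1$, then system (9) is

$$\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

which has the solutions $x_1 = s$, $x_2 = t$, $x_3 = 0$ (verify), or, in matrix form,

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} s \\ t \\ 0 \end{bmatrix}$$

As anticipated, these are the vectors in the xy-plane.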



Summary 



In Theorem 2.3.6 we listed six results that are equivalent to the invertibility of a matrix A. We conclude this section by merging 
Theorem 4.3.1 with that list to produce the following theorem that relates all of the major topics we have studied thus far. 



THEOREM 4.3.4 



Equivalent Statements 

If $A$ is an $n \times n$ matrix, and if $T_A: R^n \to R^n$ is multiplication by $A$, then the following are equivalent.

(a) $A$ is invertible.

(b) $A\mathbf{x} = \mathbf{0}$ has only the trivial solution.

(c) The reduced row-echelon form of $A$ is $I_n$.

(d) $A$ is expressible as a product of elementary matrices.

(e) $A\mathbf{x} = \mathbf{b}$ is consistent for every $n \times 1$ matrix $\mathbf{b}$.

(f) $A\mathbf{x} = \mathbf{b}$ has exactly one solution for every $n \times 1$ matrix $\mathbf{b}$.

(g) $\det(A) \neq 0$.

(h) The range of $T_A$ is $R^n$.

(i) $T_A$ is one-to-one.



Exercise Set 4.3 






1. By inspection, determine whether the linear operator is one-to-one.

(a) the orthogonal projection on the x-axis in $R^2$

(b) the reflection about the y-axis in $R^2$

(c) the reflection about the line y = x in $R^2$

(d) a contraction with factor $k > 0$ in $R^2$

(e) a rotation about the z-axis in $R^3$

(f) a reflection about the xy-plane in $R^3$

(g) a dilation with factor $k > 0$ in $R^3$



2. Find the standard matrix for the linear operator defined by the equations, and use Theorem 4.3.4 to determine whether the operator is one-to-one.

(a) $w_1 = 8x_1 + 4x_2$
    $w_2 = 2x_1 + x_2$

(b) $w_1 = 2x_1 - 3x_2$
    $w_2 = 5x_1 + x_2$

(c) $w_1 = -x_1 + 3x_2 + 2x_3$
    $w_2 = 2x_1 + 4x_3$
    $w_3 = x_1 + 3x_2 + 6x_3$

(d) $w_1 = x_1 + 2x_2 + 3x_3$
    $w_2 = 2x_1 + 5x_2 + 3x_3$
    $w_3 = x_1 + 8x_3$



3. Show that the range of the linear operator defined by the equations

$w_1 = 4x_1 - 2x_2$
$w_2 = 2x_1 - x_2$

is not all of $R^2$, and find a vector that is not in the range.

4. Show that the range of the linear operator defined by the equations

$w_1 = x_1 - 2x_2 + x_3$
$w_2 = 5x_1 - x_2 + 3x_3$
$w_3 = 4x_1 + x_2 + 2x_3$

is not all of $R^3$, and find a vector that is not in the range.



5. Determine whether the linear operator $T: R^2 \to R^2$ defined by the equations is one-to-one; if so, find the standard matrix for the inverse operator, and find $T^{-1}(w_1, w_2)$.

(a) $w_1 = x_1 + 2x_2$
    $w_2 = -x_1 + x_2$

(b) $w_1 = 4x_1 - 6x_2$
    $w_2 = -2x_1 + 3x_2$

(c) $w_1 = -x_2$
    $w_2 = -x_1$

(d) $w_1 = 3x_1$
    $w_2 = -5x_1$

6. Determine whether the linear operator $T: R^3 \to R^3$ defined by the equations is one-to-one; if so, find the standard matrix for the inverse operator, and find $T^{-1}(w_1, w_2, w_3)$.

(a) $w_1 = x_1 - 2x_2 + 2x_3$
    $w_2 = 2x_1 + x_2 + x_3$
    $w_3 = x_1 + x_2$

(b) $w_1 = x_1 - 3x_2 + 4x_3$
    $w_2 = -x_1 + x_2 + x_3$
    $w_3 = -2x_2 + 5x_3$

(c) $w_1 = x_1 + 4x_2 - x_3$
    $w_2 = 2x_1 + 7x_2 + x_3$
    $w_3 = x_1 + 3x_2$

(d) $w_1 = x_1 + 2x_2 + x_3$
    $w_2 = -2x_1 + x_2 + 4x_3$
    $w_3 = 7x_1 + 4x_2 - 5x_3$



7. By inspection, determine the inverse of the given one-to-one linear operator.

(a) the reflection about the x-axis in $R^2$

(b) the rotation through an angle of $\pi/4$ in $R^2$

(c) the dilation by a factor of 3 in $R^2$

(d) the reflection about the yz-plane in $R^3$



( e ) the contraction by a factor of -^ in j? 3 
In Exercises 8 and 9 use Theorem 4.3.2 to determine whether $T: R^2 \to R^2$ is a linear operator.

8.

(a) $T(x, y) = (2x, y)$

(b) $T(x, y) = (x^2, y)$

(c) $T(x, y) = (-y, x)$

(d) $T(x, y) = (x, 0)$

9.

(a) $T(x, y) = (2x + y,\ x - y)$

(b) $T(x, y) = (x + 1,\ y)$

(c) $T(x, y) = (y, y)$

(d) $T(x, y) = (\sqrt[3]{x},\ \sqrt[3]{y})$

In Exercises 10 and 11 use Theorem 4.3.2 to determine whether $T: R^3 \to R^2$ is a linear transformation.

10.

(a) $T(x, y, z) = (x,\ x + y + z)$

(b) $T(x, y, z) = (1, 1)$

11.

(a) $T(x, y, z) = (0, 0)$

(b) $T(x, y, z) = (3x - 4y,\ 2x - 5z)$



12. In each part, use Theorem 4.3.3 to find the standard matrix for the linear operator from the images of the standard basis vectors.

(a) the reflection operators on $R^2$ in Table 2 of Section 4.2

(b) the reflection operators on $R^3$ in Table 3 of Section 4.2

(c) the projection operators on $R^2$ in Table 4 of Section 4.2

(d) the projection operators on $R^3$ in Table 5 of Section 4.2

(e) the rotation operators on $R^2$ in Table 6 of Section 4.2

(f) the dilation and contraction operators on $R^3$ in Table 9 of Section 4.2

13. Use Theorem 4.3.3 to find the standard matrix for $T: R^2 \to R^2$ from the images of the standard basis vectors.

(a) $T: R^2 \to R^2$ projects a vector orthogonally onto the x-axis and then reflects that vector about the y-axis.

(b) $T: R^2 \to R^2$ reflects a vector about the line y = x and then reflects that vector about the x-axis.

(c) $T: R^2 \to R^2$ dilates a vector by a factor of 3, then reflects that vector about the line y = x, and then projects that vector orthogonally onto the y-axis.



14. 



Use Theorem 4.3.3 to find the standard matrix for T:R? > R? from the images of the standard basis vectors. 



( a ) XR 3 * R^ reflects a vector about the ^-plane and then contracts that vector by a factor of -^ 

(b) T:R^ * i? 3 projects a vector orthogonally onto the ^z-plane and then projects that vector orthogonally onto the xy 

-plane. 

(c) T\R^ ► i? 3 reflects a vector about the xy-plane, then reflects that vector about the ^-plane, and then reflects that 

vector about the y^-plane. 



15. Let $T_A: R^3 \to R^3$ be multiplication by

$$A = \begin{bmatrix} -1 & 3 & 0 \\ 2 & 1 & 2 \\ 4 & 5 & -3 \end{bmatrix}$$

and let $e_1$, $e_2$, and $e_3$ be the standard basis vectors for $R^3$. Find the following vectors by inspection.

(a) $T_A(e_1)$, $T_A(e_2)$, and $T_A(e_3)$

(b) $T_A(e_1 + e_2 + e_3)$

(c) $T_A(7e_3)$



16. 



Determine whether multiplication by A is a one-to-one linear transformation. 



(a) 



A = 



<®A = 



1 -1 

2 

3 -4 




1 2 
-1 


3 
-4 



(c) 



1 2 


1 


1 


1 


1 1 





1 


-1 



17. Use the result in Example 6 to find the orthogonal projection of x onto the line through the origin that makes an angle $\theta$ with the positive x-axis.

(a) $x = (-1, 2)$; $\theta = 45°$

(b) $x = (1, 0)$; $\theta = 30°$

(c) $x = (1, 5)$; $\theta = 120°$



18. Use the type of argument given in Example 8 to find the eigenvalues and corresponding eigenvectors of T. Check your conclusions by calculating the eigenvalues and corresponding eigenvectors from the standard matrix for T.

(a) $T: R^2 \to R^2$ is the reflection about the x-axis.

(b) $T: R^2 \to R^2$ is the reflection about the line $y = x$.

(c) $T: R^2 \to R^2$ is the orthogonal projection on the x-axis.



(d) X:R 2 ► R 2 is the contraction by a factor of -^-. 



19. Follow the directions of Exercise 18.

(a) $T: R^3 \to R^3$ is the reflection about the yz-plane.



(b) T:R? ► R? is the orthogonal projection on the ^z-plane. 



(c) $T: R^3 \to R^3$ is the dilation by a factor of 2.

(d) $T: R^3 \to R^3$ is a rotation of $\pi/4$ about the z-axis.



20. 



(a) Is a composition of one-to-one linear transformations one-to-one? Justify your conclusion. 



(b) Can the composition of a one-to-one linear transformation and a linear transformation that is not one-to-one be 
one-to-one? Account for both possible orders of composition and justify your conclusion. 



21. 



Show that $T(x, y) = (0, 0)$ defines a linear operator on $R^2$ but $T(x, y) = (1, 1)$ does not.



22. 



(a) Prove that if $T: R^n \to R^m$ is a linear transformation, then $T(0) = 0$; that is, T maps the zero vector in $R^n$ into the zero vector in $R^m$.

(b) The converse of this is not true. Find an example of a function that satisfies $T(0) = 0$ but is not a linear transformation.



23. Let $l$ be the line in the xy-plane that passes through the origin and makes an angle $\theta$ with the positive x-axis, where $0 \le \theta < \pi$. Let $T: R^2 \to R^2$ be the linear operator that reflects each vector about $l$ (see the accompanying figure).

(a) Use the method of Example 6 to find the standard matrix for T.

(b) Find the reflection of the vector $x = (1, 5)$ about the line $l$ through the origin that makes an angle of $\theta = 30°$ with the positive x-axis.




Figure Ex-23 

24. Prove: An $n \times n$ matrix A is invertible if and only if the linear system $Ax = w$ has exactly one solution for every vector w in $R^n$ for which the system is consistent.

Discussion 

Discovery

25. Indicate whether each statement is always true or sometimes false. Justify your answer by giving a logical argument or a counterexample.

(a) If T maps $R^n$ into $R^m$, and $T(0) = 0$, then T is linear.

(b) If $T: R^n \to R^m$ is a one-to-one linear transformation, then there are no distinct vectors u and v in $R^n$ such that $T(u - v) = 0$.

(c) If $T: R^n \to R^n$ is a linear operator, and if $T(x) = 2x$ for some vector x, then $\lambda = 2$ is an eigenvalue of T.

(d) If T maps $R^n$ into $R^m$, and if $T(c_1 u + c_2 v) = c_1 T(u) + c_2 T(v)$ for all scalars $c_1$ and $c_2$ and for all vectors u and v in $R^n$, then T is linear.



26. Indicate whether each statement is always true, sometimes true, or always false.

(a) If $T: R^n \to R^m$ is a linear transformation and $m > n$, then T is one-to-one.

(b) If $T: R^n \to R^m$ is a linear transformation and $m < n$, then T is one-to-one.

(c) If $T: R^n \to R^m$ is a linear transformation and $m = n$, then T is one-to-one.

27. Let A be an $n \times n$ matrix such that $\det(A) = 0$, and let $T: R^n \to R^n$ be multiplication by A.

(a) What can you say about the range of the linear operator T? Give an example that illustrates your conclusion.

(b) What can you say about the number of vectors that T maps into 0?



28. In each part, make a conjecture about the eigenvectors and eigenvalues of the matrix A corresponding to the given transformation by considering the geometric properties of multiplication by A. Confirm each of your conjectures with computations.



(a) Reflection about the line y = c - 

(b) Contraction by a factor of $\frac{1}{2}$.



4.4 LINEAR TRANSFORMATIONS AND POLYNOMIALS



In this section we shall apply our new knowledge of linear transformations to
polynomials. This is the beginning of a general strategy of using our ideas about
$R^n$ to solve problems that are in different, yet somehow analogous, settings.



Polynomials and Vectors 

Suppose that we have a polynomial function, say

$$p(x) = ax^2 + bx + c$$

where x is a real-valued variable. To form the related function $2p(x)$ we multiply each of its coefficients by 2:

$$2p(x) = 2ax^2 + 2bx + 2c$$

That is, if the coefficients of the polynomial $p(x)$ are a, b, c in descending order of the power of x with which they are associated,
then $2p(x)$ is also a polynomial, and its coefficients are 2a, 2b, 2c in the same order.

Similarly, if $q(x) = dx^2 + ex + f$ is another polynomial function, then $p(x) + q(x)$ is also a polynomial, and its coefficients are
$a + d$, $b + e$, $c + f$. We add polynomials by adding corresponding coefficients.

This suggests that associating a polynomial with the vector consisting of its coefficients may be useful. 



EXAMPLE 1 Correspondence between Polynomials and Vectors 



Consider the quadratic function $p(x) = ax^2 + bx + c$. Define the vector

$$z = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$$

consisting of the coefficients of this polynomial in descending order of the corresponding power of x. Then multiplication of $p(x)$
by a scalar s gives $sp(x) = sax^2 + sbx + sc$, and this corresponds exactly to the scalar multiple

$$sz = \begin{bmatrix} sa \\ sb \\ sc \end{bmatrix}$$

of z. Similarly, $p(x) + p(x)$ is $2ax^2 + 2bx + 2c$, and this corresponds exactly to the vector sum $z + z$:

$$z + z = \begin{bmatrix} a \\ b \\ c \end{bmatrix} + \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} 2a \\ 2b \\ 2c \end{bmatrix}$$

In general, given a polynomial $p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ we associate with it the vector

$$z = \begin{bmatrix} a_n \\ a_{n-1} \\ \vdots \\ a_1 \\ a_0 \end{bmatrix}$$

in $R^{n+1}$ (Figure 4.4.1). It is then possible to view operations like $p(x) \to 2p(x)$ as being equivalent to a linear transformation on
$R^{n+1}$, namely $T(z) = 2z$. We can perform the desired operations in $R^{n+1}$ rather than on the polynomials themselves.

Figure 4.4.1 The vector z is associated with the polynomial p.



EXAMPLE 2 Addition of Polynomials by Adding Vectors 



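Let $p(x) = 4x^3 - 2x + 1$ and $q(x) = 3x^3 - 3x^2 + x$. Then to compute $r(x) = 4p(x) - 2q(x)$, we could define

$$u = \begin{bmatrix} 4 \\ 0 \\ -2 \\ 1 \end{bmatrix}, \quad v = \begin{bmatrix} 3 \\ -3 \\ 1 \\ 0 \end{bmatrix}$$

and perform the corresponding operation on these vectors:

$$4u - 2v = 4\begin{bmatrix} 4 \\ 0 \\ -2 \\ 1 \end{bmatrix} - 2\begin{bmatrix} 3 \\ -3 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 10 \\ 6 \\ -10 \\ 4 \end{bmatrix}$$

Hence $r(x) = 10x^3 + 6x^2 - 10x + 4$.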



This association between polynomials of degree n and vectors in $R^{n+1}$ would be useful for someone writing a computer program
to perform polynomial computations, as in a computer algebra system. The coefficients of polynomial functions could be stored as
vectors, and computations could be performed on these vectors.
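
As a minimal sketch of that idea (not any particular computer algebra system's implementation), the Python fragment below stores polynomials as coefficient lists in descending order of power and reproduces the computation of Example 2. The helper names `scale`, `add`, and `evaluate` are illustrative assumptions, not part of the text.

```python
# Polynomials are stored as coefficient lists in descending order of power,
# matching the convention in the text: 4x^3 - 2x + 1  ->  [4, 0, -2, 1].

def scale(s, z):
    """Return the coefficient vector of s * p(x)."""
    return [s * c for c in z]

def add(z, w):
    """Return the coefficient vector of p(x) + q(x); equal lengths assumed."""
    if len(z) != len(w):
        raise ValueError("pad the shorter polynomial with leading zeros first")
    return [a + b for a, b in zip(z, w)]

def evaluate(z, x):
    """Evaluate the polynomial with coefficient vector z at x (Horner's rule)."""
    result = 0
    for c in z:
        result = result * x + c
    return result

u = [4, 0, -2, 1]    # p(x) = 4x^3 - 2x + 1
v = [3, -3, 1, 0]    # q(x) = 3x^3 - 3x^2 + x
r = add(scale(4, u), scale(-2, v))   # coefficient vector of 4p(x) - 2q(x)
print(r)             # [10, 6, -10, 4]
```

The vector arithmetic never touches the variable x; evaluating `r` at a sample point and comparing with `4*p(x) - 2*q(x)` is one way to confirm that the two views agree.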

For convenience, we define $P_n$ to be the set of all polynomials of degree at most n (including the zero polynomial, all the
coefficients of which are zero). This is also called the space of polynomials of degree at most n. The use of the word space
indicates that this set has some sort of structure to it. The structure of $P_n$ will be explored in Chapter 8.



EXAMPLE 3 Differentiation of Polynomials 



Calculus Required 



Differentiation takes polynomials of degree n to polynomials of degree $n - 1$, so the corresponding transformation on vectors must
take vectors in $R^{n+1}$ to vectors in $R^n$. Hence, if differentiation corresponds to a linear transformation, it must be represented by an
$n \times (n + 1)$ matrix. For example, if p is an element of $P_2$, that is,

$$p(x) = ax^2 + bx + c$$

for some real numbers a, b, and c, then

$$\frac{d}{dx} p(x) = 2ax + b$$

Evidently, if $p(x)$ in $P_2$ corresponds to the vector $(a, b, c)$ in $R^3$, then its derivative is in $P_1$ and corresponds to the vector $(2a, b)$
in $R^2$. Note that

$$\begin{bmatrix} 2a \\ b \end{bmatrix} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix}$$

The operation differentiation, $D: P_2 \to P_1$, corresponds to a linear transformation $T_A: R^3 \to R^2$, where

$$A = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$$
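
The same pattern extends to any degree. The following Python sketch builds the $n \times (n + 1)$ differentiation matrix for coefficients stored in descending order and checks it against the $P_2$ case above; the function names `differentiation_matrix` and `apply` are hypothetical and chosen only for this illustration.

```python
def differentiation_matrix(n):
    """Return the n x (n+1) matrix that maps the coefficient vector of p in P_n
    (descending order of power) to the coefficient vector of p' in P_(n-1).
    Assumes n >= 1."""
    rows = []
    for i in range(n):
        row = [0] * (n + 1)
        row[i] = n - i          # d/dx of x^(n-i) is (n-i) * x^(n-i-1)
        rows.append(row)
    return rows

def apply(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

A = differentiation_matrix(2)
print(A)                       # [[2, 0, 0], [0, 1, 0]]
print(apply(A, [3, 5, 7]))     # derivative of 3x^2 + 5x + 7 is 6x + 5 -> [6, 5]
```

The last column of the matrix is always zero, which reflects the fact that the constant term contributes nothing to the derivative.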



Some transformations from $P_n$ to $P_m$ do not correspond to linear transformations from $R^{n+1}$ to $R^{m+1}$. For example, if we consider
the transformation of $ax^2 + bx + c$ in $P_2$ to $|a|$ in $P_0$, the space of all constants (viewed as polynomials of degree zero, plus the
zero polynomial), then we find that there is no matrix that maps $(a, b, c)$ in $R^3$ to $|a|$ in $R$. Other transformations may correspond to
transformations that are not quite linear, in the following sense.



DEFINITION 



An affine transformation from $R^n$ to $R^m$ is a mapping of the form $S(u) = T(u) + f$, where T is a linear transformation from
$R^n$ to $R^m$ and $f$ is a (constant) vector in $R^m$.



The affine transformation S is a linear transformation if $f$ is the zero vector. Otherwise, it isn't linear, because it doesn't satisfy
Theorem 4.3.2. This may seem surprising because the form of S looks like a natural generalization of an equation describing a line,
but linear transformations satisfy the Principle of Superposition

$$T(c_1 u + c_2 v) = c_1 T(u) + c_2 T(v)$$

for any scalars $c_1$, $c_2$ and any vectors u, v in their domain. (This is just a restatement of Theorem 4.3.2.) Affine transformations
with $f$ nonzero don't have this property. For instance, $S(0) = T(0) + f = f \neq 0$, whereas every linear transformation maps the zero vector to the zero vector.



EXAMPLE 4 Affine Transformations 



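The mapping

$$S(u) = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} u + \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

is an affine transformation on $R^2$. If $u = (a, b)$, then

$$S(u) = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} b + 1 \\ -a + 1 \end{bmatrix}$$

The corresponding operation from $P_1$ to $P_1$ takes $ax + b$ to $(b + 1)x - a + 1$.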
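
To see concretely why an affine map with nonzero constant vector fails the Principle of Superposition, the Python sketch below tests the map S of Example 4 against its linear part T on one arbitrarily chosen pair of vectors and scalars. The sample values `u`, `v`, `c1`, `c2` are assumptions made for illustration; a single failing case is enough to show that S is not linear.

```python
def T(u):
    """Linear part of Example 4: multiplication by [[0, 1], [-1, 0]]."""
    a, b = u
    return (b, -a)

F = (1, 1)   # the constant vector f of Example 4

def S(u):
    """Affine map S(u) = T(u) + f."""
    t = T(u)
    return (t[0] + F[0], t[1] + F[1])

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def scale(c, u):
    return (c * u[0], c * u[1])

u, v = (2, 3), (-1, 4)     # sample vectors (illustrative choice)
c1, c2 = 2, -3             # sample scalars (illustrative choice)
combo = add(scale(c1, u), scale(c2, v))

# T satisfies superposition on this sample; S does not.
print(T(combo) == add(scale(c1, T(u)), scale(c2, T(v))))   # True
print(S(combo) == add(scale(c1, S(u)), scale(c2, S(v))))   # False
print(S((2, 3)))                                           # (4, -1), i.e. (b + 1, -a + 1)
```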



The relationship between an action on $P_n$ and its corresponding action on the vector of coefficients in $R^{n+1}$, and the similarities
between $P_n$ and $R^{n+1}$, will be explored in more detail later in this text.

Interpolating Polynomials 

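Consider the problem of interpolating a polynomial to a set of $n + 1$ points $(x_0, y_0), \ldots, (x_n, y_n)$. That is, we seek to find a curve
$p(x) = a_m x^m + a_{m-1} x^{m-1} + \cdots + a_1 x + a_0$ of minimum degree that goes through each of these data points (Figure 4.4.2). Such a
curve must satisfy

$$\begin{aligned}
y_0 &= a_m x_0^m + a_{m-1} x_0^{m-1} + \cdots + a_1 x_0 + a_0 \\
y_1 &= a_m x_1^m + a_{m-1} x_1^{m-1} + \cdots + a_1 x_1 + a_0 \\
&\;\;\vdots \\
y_n &= a_m x_n^m + a_{m-1} x_n^{m-1} + \cdots + a_1 x_n + a_0
\end{aligned}$$

Figure 4.4.2 Interpolation

Because the $x_i$ are known, this leads to the following matrix system:

$$\begin{bmatrix}
1 & x_0 & x_0^2 & \cdots & x_0^m \\
1 & x_1 & x_1^2 & \cdots & x_1^m \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{n-1} & x_{n-1}^2 & \cdots & x_{n-1}^m \\
1 & x_n & x_n^2 & \cdots & x_n^m
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_m \end{bmatrix}
=
\begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_{n-1} \\ y_n \end{bmatrix}$$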

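Note that this is a square system when $n = m$. Taking $n = m$ gives the following system for the coefficients of the interpolating
polynomial $p(x)$:

$$\begin{bmatrix}
1 & x_0 & x_0^2 & \cdots & x_0^n \\
1 & x_1 & x_1^2 & \cdots & x_1^n \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{n-1} & x_{n-1}^2 & \cdots & x_{n-1}^n \\
1 & x_n & x_n^2 & \cdots & x_n^n
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \\ a_n \end{bmatrix}
=
\begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_{n-1} \\ y_n \end{bmatrix}
\qquad (1)$$

The matrix in (1) is known as a Vandermonde matrix; column j is the second column raised elementwise to the $j - 1$ power. The
linear system in (1) is said to be a Vandermonde system.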



EXAMPLE 5 Interpolating a Cubic 



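To interpolate a polynomial to the data (-2, 11), (-1, 2), (1, 2), (2, -1), we form the Vandermonde system (1):

$$\begin{bmatrix}
1 & x_0 & x_0^2 & x_0^3 \\
1 & x_1 & x_1^2 & x_1^3 \\
1 & x_2 & x_2^2 & x_2^3 \\
1 & x_3 & x_3^2 & x_3^3
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix}
=
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \end{bmatrix}$$

For this data, we have

$$\begin{bmatrix}
1 & -2 & 4 & -8 \\
1 & -1 & 1 & -1 \\
1 & 1 & 1 & 1 \\
1 & 2 & 4 & 8
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix}
=
\begin{bmatrix} 11 \\ 2 \\ 2 \\ -1 \end{bmatrix}$$

The solution, found by Gaussian elimination, is

$$\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ -1 \end{bmatrix}$$

and so the interpolant is $p(x) = -x^3 + x^2 + x + 1$. This is plotted in Figure 4.4.3, together with the data points, and we see that
$p(x)$ does indeed interpolate the data, as required.

Figure 4.4.3 The interpolant of Example 5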
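
The Vandermonde approach of Example 5 can be sketched in a few lines of Python. The version below uses exact rational arithmetic from the standard library and a plain Gaussian elimination routine written for this illustration; the function names `vandermonde` and `solve` are assumptions of the sketch, and a numerical library would normally be used instead for larger problems.

```python
from fractions import Fraction

def vandermonde(xs):
    """Rows [1, x, x^2, ..., x^n] for each data abscissa x, as in system (1)."""
    n = len(xs)
    return [[Fraction(x) ** j for j in range(n)] for x in xs]

def solve(A, y):
    """Solve A a = y by Gaussian elimination followed by back-substitution.
    Assumes A is square and invertible (true when the x-values are distinct)."""
    n = len(A)
    M = [row[:] + [Fraction(b)] for row, b in zip(A, y)]   # augmented matrix
    for col in range(n):
        # Pick a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

xs = [-2, -1, 1, 2]
ys = [11, 2, 2, -1]
a = solve(vandermonde(xs), ys)     # coefficients a_0, a_1, a_2, a_3
print([int(c) for c in a])         # [1, 1, 1, -1], i.e. p(x) = -x^3 + x^2 + x + 1
```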



Newton Form 



The interpolating polynomial $p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ is said to be written in its natural, or standard, form. But
there is convenience in using other forms. For example, suppose we seek a cubic interpolant to the data $(x_0, y_0)$, $(x_1, y_1)$, $(x_2, y_2)$,
$(x_3, y_3)$. If we write

$$p(x) = a_3 x^3 + a_2 x^2 + a_1 x + a_0 \qquad (2)$$

in the equivalent form

$$p(x) = a_3 (x - x_0)^3 + a_2 (x - x_0)^2 + a_1 (x - x_0) + a_0$$

then the interpolation condition $p(x_0) = y_0$ immediately gives $a_0 = y_0$. This reduces the size of the system that must be solved
from $(n + 1) \times (n + 1)$ to $n \times n$. That is not much of a savings, but if we take this idea further, we may write (2) in the equivalent
form



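$$p(x) = b_3 (x - x_0)(x - x_1)(x - x_2) + b_2 (x - x_0)(x - x_1) + b_1 (x - x_0) + b_0 \qquad (3)$$

which is called the Newton form of the interpolant. Set $h_i = x_i - x_{i-1}$ for $i = 1, 2, 3$. The interpolation conditions give

$$\begin{aligned}
p(x_0) &= b_0 \\
p(x_1) &= b_1 h_1 + b_0 \\
p(x_2) &= b_2 (h_1 + h_2) h_2 + b_1 (h_1 + h_2) + b_0 \\
p(x_3) &= b_3 (h_1 + h_2 + h_3)(h_2 + h_3) h_3 + b_2 (h_1 + h_2 + h_3)(h_2 + h_3) + b_1 (h_1 + h_2 + h_3) + b_0
\end{aligned}$$

that is,

$$\begin{bmatrix}
1 & 0 & 0 & 0 \\
1 & h_1 & 0 & 0 \\
1 & h_1 + h_2 & (h_1 + h_2) h_2 & 0 \\
1 & h_1 + h_2 + h_3 & (h_1 + h_2 + h_3)(h_2 + h_3) & (h_1 + h_2 + h_3)(h_2 + h_3) h_3
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix}
=
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \end{bmatrix}
\qquad (4)$$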



Unlike the Vandermonde system (1), this system has a lower triangular coefficient matrix. This is a much simpler system. We may
solve for the coefficients very easily and efficiently by forward-substitution, in analogy with back-substitution. In the case of
equally spaced points arranged in increasing order, we have $h_i = h > 0$, so (4) becomes

$$\begin{bmatrix}
1 & 0 & 0 & 0 \\
1 & h & 0 & 0 \\
1 & 2h & 2h^2 & 0 \\
1 & 3h & 6h^2 & 6h^3
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix}
=
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \end{bmatrix}$$

Note that the determinant of the coefficient matrix in (4) is the product of its diagonal entries, each of which is a product of differences $x_i - x_j$ with $i \neq j$. It is therefore nonzero exactly when the $x_i$ are distinct, so there exists a unique interpolant whenever the $x_i$
are distinct. Because the Vandermonde system computes a different form of the same interpolant, it too must have a unique
solution exactly when the $x_i$ are distinct.



EXAMPLE 6 Interpolating a Cubic in Newton Form 



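To interpolate a polynomial in Newton form to the data (-2, 11), (-1, 2), (1, 2), (2, -1) of Example 5, we form the system (4):

$$\begin{bmatrix}
1 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 \\
1 & 3 & 6 & 0 \\
1 & 4 & 12 & 12
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix}
=
\begin{bmatrix} 11 \\ 2 \\ 2 \\ -1 \end{bmatrix}$$

The solution, found by forward-substitution, is

$$\begin{aligned}
b_0 &= 11 \\
b_0 + b_1 &= 2 &\Rightarrow\quad b_1 &= -9 \\
b_0 + 3b_1 + 6b_2 &= 2 &\Rightarrow\quad b_2 &= 3 \\
b_0 + 4b_1 + 12b_2 + 12b_3 &= -1 &\Rightarrow\quad b_3 &= -1
\end{aligned}$$

and so, from (3), the interpolant is

$$\begin{aligned}
p(x) &= -1 \cdot (x + 2)(x + 1)(x - 1) + 3 \cdot (x + 2)(x + 1) + (-9) \cdot (x + 2) + 11 \\
&= -(x + 2)(x + 1)(x - 1) + 3(x + 2)(x + 1) - 9(x + 2) + 11
\end{aligned}$$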
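
Forward-substitution on the lower triangular system (4) is short enough to sketch directly. The Python fragment below works for any number of distinct data points, builds each row of the triangular matrix on the fly rather than storing it, and reproduces the coefficients of Example 6. The function names `newton_coefficients` and `newton_eval` are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction

def newton_coefficients(xs, ys):
    """Solve the lower triangular system for b_0, ..., b_n by forward-substitution.
    Entry j of row i is the product (x_i - x_0)(x_i - x_1)...(x_i - x_(j-1))."""
    b = []
    for i in range(len(xs)):
        total = Fraction(0)     # contribution of the already-known b_0 .. b_(i-1)
        basis = Fraction(1)     # running product; equals the diagonal entry at the end
        for j in range(i):
            total += b[j] * basis
            basis *= xs[i] - xs[j]
        b.append((Fraction(ys[i]) - total) / basis)
    return b

def newton_eval(xs, b, x):
    """Evaluate the Newton form at x by nested multiplication."""
    result = Fraction(b[-1])
    for j in range(len(b) - 2, -1, -1):
        result = result * (x - xs[j]) + b[j]
    return result

xs = [-2, -1, 1, 2]
ys = [11, 2, 2, -1]
b = newton_coefficients(xs, ys)
print([int(c) for c in b])      # [11, -9, 3, -1]
```

Each coefficient costs a number of operations proportional to its index, which is the source of the quadratic operation count discussed in the text.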



Converting between Forms 



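The Newton form offers other advantages, but now we turn to the following question: If we have the coefficients of the
interpolating polynomial in Newton form, what are the coefficients in the standard form? For example, if we know the coefficients
in

$$p(x) = b_3 (x - x_0)(x - x_1)(x - x_2) + b_2 (x - x_0)(x - x_1) + b_1 (x - x_0) + b_0$$

because we have solved (4) in order to avoid having to solve the more complicated Vandermonde system (1), how can we get the
coefficients in (2),

$$p(x) = a_3 x^3 + a_2 x^2 + a_1 x + a_0$$

from $b_0$, $b_1$, $b_2$, $b_3$? Expanding the products in (3) gives

$$\begin{aligned}
p(x) = {} & b_3 x^3 + \bigl(b_2 - b_3 (x_0 + x_1 + x_2)\bigr) x^2 \\
& + \bigl(b_1 - b_2 (x_0 + x_1) + b_3 (x_0 x_1 + x_0 x_2 + x_1 x_2)\bigr) x \\
& + \bigl(b_0 - x_0 b_1 + x_0 x_1 b_2 - x_0 x_1 x_2 b_3\bigr)
\end{aligned}$$

so

$$\begin{aligned}
a_0 &= b_0 - x_0 b_1 + x_0 x_1 b_2 - x_0 x_1 x_2 b_3 \\
a_1 &= b_1 - b_2 (x_0 + x_1) + b_3 (x_0 x_1 + x_0 x_2 + x_1 x_2) \\
a_2 &= b_2 - b_3 (x_0 + x_1 + x_2) \\
a_3 &= b_3
\end{aligned}$$

This can be expressed as

$$\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix}
=
\begin{bmatrix}
1 & -x_0 & x_0 x_1 & -x_0 x_1 x_2 \\
0 & 1 & -(x_0 + x_1) & x_0 x_1 + x_0 x_2 + x_1 x_2 \\
0 & 0 & 1 & -(x_0 + x_1 + x_2) \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix}
\qquad (5)$$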



This is an important result! Solving the Vandermonde system (1) by Gaussian elimination would require us to form an $n \times n$ matrix
that might have no zero entries and then to solve it using a number of arithmetic operations that grows in proportion to $n^3$ for
large n. But solving the lower triangular system (4) requires an amount of work that grows in proportion to $n^2$ for large n, and using
(5) to compute the coefficients $a_0$, $a_1$, $a_2$, $a_3$ also requires an amount of work that grows in proportion to $n^2$ for large n. Hence, for
large n, the latter approach is an order of magnitude more efficient. The two-step procedure of solving (4) and then using the linear
transformation (5) is a superior approach to solving (1) when n is large (Figure 4.4.4).
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Figure 4.4.4 



Indirect route to conversion from Newton form to standard form 



EXAMPLE 7 Changing Forms 



In Example 5 we found that $a_0 = 1$, $a_1 = 1$, $a_2 = 1$, $a_3 = -1$, whereas in Example 6 we found that $b_0 = 11$, $b_1 = -9$, $b_2 = 3$,
$b_3 = -1$ for the same data. From (5), with $x_0 = -2$, $x_1 = -1$, $x_2 = 1$, we expect that

$$\begin{bmatrix} 1 \\ 1 \\ 1 \\ -1 \end{bmatrix}
=
\begin{bmatrix}
1 & 2 & 2 & -2 \\
0 & 1 & 3 & -1 \\
0 & 0 & 1 & 2 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} 11 \\ -9 \\ 3 \\ -1 \end{bmatrix}$$

which checks.
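
The check in Example 7 can also be carried out mechanically. The Python sketch below builds the matrix of (5) for the cubic case from $x_0$, $x_1$, $x_2$ and applies it to the Newton coefficients; the function name `newton_to_standard_cubic` is an assumption made for this illustration and covers only the 4 x 4 case written out in the text.

```python
def newton_to_standard_cubic(b, x0, x1, x2):
    """Apply the linear transformation (5): Newton coefficients (b0, b1, b2, b3)
    to standard-form coefficients (a0, a1, a2, a3) of a cubic."""
    M = [
        [1, -x0, x0 * x1, -x0 * x1 * x2],
        [0, 1, -(x0 + x1), x0 * x1 + x0 * x2 + x1 * x2],
        [0, 0, 1, -(x0 + x1 + x2)],
        [0, 0, 0, 1],
    ]
    a = [sum(m * c for m, c in zip(row, b)) for row in M]
    return M, a

M, a = newton_to_standard_cubic([11, -9, 3, -1], -2, -1, 1)
print(M[0])    # [1, 2, 2, -2]
print(a)       # [1, 1, 1, -1]
```

Because the matrix is upper triangular with ones on the diagonal, it is always invertible, so the conversion can be reversed to recover the Newton coefficients from the standard ones.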



There is another approach to solving (1), based on the Fast Fourier Transform, that also requires an amount of work proportional to
$n^2$. The point for now is to see that the use of linear transformations on $R^{n+1}$ can help us perform computations involving
polynomials. The original problem, to fit a polynomial of minimum degree to a set of data points, was not couched in the
language of linear algebra at all. But rephrasing it in those terms and using matrices and the notation of linear transformations on
$R^{n+1}$ has allowed us to see when a unique solution must exist, how to compute it efficiently, and how to transform it among
various forms.



Exercise Set 4.4 






Identify the operations on polynomials that correspond to the following operations on vectors. Give the resulting polynomial. 



(a) 



f 




"3" 


2 


-2 





-1 




2 



(b) 



~A 




"1" 


3 


1 6 


2 







1 



(c) 



f 




o" 


2 




2 


1 


— 





-2 




-2 


1 








(d) 



~ 



2. 



(a) Consider the operation on $P_2$ that takes $ax^2 + bx + c$ to $cx^2 + bx + a$. Does it correspond to a linear transformation
from $R^3$ to $R^3$? If so, what is its matrix?

(b) Consider the operation on $P_3$ that takes $ax^3 + bx^2 + cx + d$ to $cx^3 - bx^2 - ax + d$. Does it correspond to a linear
transformation from $R^4$ to $R^4$? If so, what is its matrix?



3. 



(a) Consider the transformation of $ax^2 + bx + c$ in $P_2$ to $|a|$ in $P_0$. Show that it does not correspond to a linear
transformation by showing that there is no matrix that maps $(a, b, c)$ in $R^3$ to $|a|$ in $R$.

(b) Does the transformation of $ax^2 + bx + c$ in $P_2$ to $a$ in $P_0$ correspond to a linear transformation from $R^3$ to $R$?



4. 



(a) Consider the operation ][{■ p 2 > P 3 that takes p( x ) in P 2 t0 *pOO i n ^3* Does this correspond to a linear 

transformation from p} to ^ 4 ? If so, what is its matrix? 



(b) Consider the operation ^- p 2 > P 3 that takes p( x ) in P H to (^ — l)^?(x) in ,P H+ i- Does this correspond to a linear 

transformation from p^ to j? 4 ? If so, what is its matrix? 



(c) Consider the operation j^ : p 2 » p 3 that takes p( x ) in p H to xp(x) I 1 in P n +\- Does this correspond to a linear 

transformation from j? 3 to ^ 4 ? If so, what is its matrix? 



5. (For Readers Who Have Studied Calculus) What matrix corresponds to differentiation in each case?

(a) $D: P_3 \to P_2$

(b) $D: P_4 \to P_3$

(c) $D: P_5 \to P_4$



6. (For Readers Who Have Studied Calculus) What matrix corresponds to differentiation in each case, assuming we represent
$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ as the vector $(a_0, a_1, \ldots, a_{n-1}, a_n)$?

Note This is the opposite of the ordering of coefficients we have been using.

(a) $D: P_3 \to P_2$

(b) $D: P_4 \to P_3$

(c) $D: P_5 \to P_4$



7. Consider the following matrices. What is the corresponding transformation on polynomials? Indicate the domain $P_i$ and the
codomain $P_j$.



(a) 



1 1 
1 -1 



(b) 



"1 


0" 


1 


1 


2 


-1 



(c) 



1 2 

2 1 1 



(d) 



"0 





0" 





1 















(e) [0 1 0] 



8. 



Consider the space of all functions of the form $a + b\cos(x) + c\sin(x)$ where a, b, c are scalars.

(a) What matrix, if any, corresponds to the change of variables $x \to x - \pi/2$, assuming that we represent a function in
this space as the vector $(a, b, c)$?

(b) What matrix corresponds to differentiation of functions on this space?



Consider the space of all functions of the form a f . ^ + ce f 4. de f , where a, b, c, d are scalars. 



(a) What function in the space corresponds to the sum of (1, 2, 3, 4) and (-1, -2, 0, -1), assuming that we represent a 
function in this space as the vector (a,b,c ? d)7 



(b) Is cosh (0 in this space? That is, does cosh(£) correspond to some choice of a, b, c, dl 



(c) What matrix corresponds to differentiation of functions on this space? 



10. 



Show that the Principle of Superposition is equivalent to Theorem 4.3.2. 



11. 



Show that an affine transformation with $f$ nonzero is not a linear transformation.



12. 



Find a quadratic interpolant to the data (-1, 2), (0, 0), (1, 2) using the Vandermonde system approach. 



13. 



(a) Find a quadratic interpolant to the data (-2, 1), (0, 1), (1, 4) using the Vandermonde system approach from 1. 



(b) Repeat using the Newton approach from 4. 



14. 



(a) Find a polynomial interpolant to the data (-1, 0), (0, 0), (1, 0), (2, 6) using the Vandermonde system approach from 1. 



(b) Repeat using the Newton approach from 4. 



(c) Use 5 to get your answer in part (a) from your answer in part (b). 



(d) Use 5 to get your answer in part (b) from your answer in part (a) by finding the inverse of the matrix. 



(e) What happens if you change the data to (-1, 0), (0, 0), (1, 0), (2, 0)? 



15. 



(a) Find a polynomial interpolant to the data (-2, -10), (-1, 2), (1, 2), (2, 14) using the Vandermonde system approach 
from 1. 



(b) Repeat using the Newton approach from 4. 



(c) Use 5 to get your answer in part (a) from your answer in part (b). 



(d) Use 5 to get your answer in part (b) from your answer in part (a) by finding the inverse of the matrix. 



16. 



Show that the determinant of the $2 \times 2$ Vandermonde matrix

$$\begin{bmatrix} 1 & a \\ 1 & b \end{bmatrix}$$

can be written as $(b - a)$ and that the determinant of the $3 \times 3$ Vandermonde matrix

$$\begin{bmatrix} 1 & a & a^2 \\ 1 & b & b^2 \\ 1 & c & c^2 \end{bmatrix}$$

can be written as $(b - a)(c - a)(c - b)$. Conclude that a unique straight line can be fit through any two points $(x_0, y_0)$,
$(x_1, y_1)$ with $x_0$ and $x_1$ distinct, and that a unique parabola (which may be degenerate, such as a line) can be fit through any
three points $(x_0, y_0)$, $(x_1, y_1)$, $(x_2, y_2)$ with $x_0$, $x_1$, and $x_2$ distinct.

17. 

(a) What form does 5 take for lines? 

(b) What form does 5 take for quadratics? 

(c) What form does 5 take for quartics? 



Discussion 
Discovery 



18. (For Readers Who Have Studied Calculus)

(a) Does indefinite integration of functions in $P_n$ correspond to some linear transformation from
$R^{n+1}$ to $R^{n+2}$?

(b) Does definite integration (from $x = 0$ to $x = 1$) of functions in $P_n$ correspond to some linear
transformation from $R^{n+1}$ to $R$?



19. (For Readers Who Have Studied Calculus)

(a) What matrix corresponds to second differentiation of functions from $P_2$ (giving functions in $P_0$)?

(b) What matrix corresponds to second differentiation of functions from $P_3$ (giving functions in $P_1$)?

(c) Is the matrix for second differentiation the square of the matrix for (first) differentiation?
Consider the transformation from p 7 to p 2 associated with the matrix 

20. ^ r^ rt rt - 


1 


and the transformation from p 2 to p^ associated with the matrix 

[0 1 0] 

These differ only in their codomains. Comment on this difference. In what ways (if any) is it 
important? 

21. The third major technique for polynomial interpolation is interpolation using Lagrange
interpolating polynomials. Given a set of distinct x-values $x_0, x_1, \ldots, x_n$, define the $n + 1$ Lagrange
interpolating polynomials for these values by (for $i = 0, 1, \ldots, n$)

$$L_i(x) = \frac{(x - x_0)(x - x_1) \cdots (x - x_{i-1})(x - x_{i+1}) \cdots (x - x_n)}{(x_i - x_0)(x_i - x_1) \cdots (x_i - x_{i-1})(x_i - x_{i+1}) \cdots (x_i - x_n)}$$

Note that $L_i(x)$ is a polynomial of exact degree n and that $L_i(x_j) = 0$ if $i \neq j$, and $L_i(x_i) = 1$. It
follows that we can write the polynomial interpolant to $(x_0, y_0), \ldots, (x_n, y_n)$ in the form

$$p(x) = c_0 L_0(x) + c_1 L_1(x) + \cdots + c_n L_n(x)$$

where $c_i = y_i$, $i = 0, 1, \ldots, n$.

(a) Verify that $p(x) = y_0 L_0(x) + y_1 L_1(x) + \cdots + y_n L_n(x)$ is the unique interpolating
polynomial for this data.

(b) What is the linear system for the coefficients $c_0, c_1, \ldots, c_n$, corresponding to (1) for the
Vandermonde approach and to (4) for the Newton approach?

(c) Compare the three approaches to polynomial interpolation that we have seen. Which is most
efficient with respect to finding the coefficients? Which is most efficient with respect to
evaluating the interpolant somewhere between data points?



22. Generalize the result in Problem 16 by finding a formula for the determinant of an $n \times n$
Vandermonde matrix for arbitrary n.



23. 



The norm of a linear transformation $T_A: R^n \to R^n$ can be defined by

$$\|T\|_E = \max \frac{\|T(x)\|}{\|x\|}$$

where the maximum is taken over all nonzero x in $R^n$. (The subscript indicates that the norm of the
linear transformation on the left is found using the Euclidean vector norm on the right.) It is a fact
that the largest value is always achieved; that is, there is always some $x_0$ in $R^n$ such that
$\|T\|_E = \|T(x_0)\| / \|x_0\|$. What are the norms of the linear transformations $T_A$ with the
following matrices?



(a) 



2 
1 



(b) 



1 
-1 



(c) 



2 
-3 



(d) 



\i{l 1//2 
\l{l -1//2 



Copyright © 2005 John Wiley & Sons, Inc. All rights reserved. 



Chapter 4 



Technology Exercises



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets.

Section 4.1 

T1. (Vector Operations in $R^n$) With most technology utilities, the commands for operating on vectors in $R^n$ are the same as
those for operating on vectors in $R^2$ and $R^3$, and the command for computing a dot product produces the Euclidean inner
product in $R^n$. Use your utility to perform computations in Exercises 1, 3, and 9 of Section 4.1.

Section 4.2

T1. (Rotations) Find the standard matrix for the linear operator on $R^3$ that performs a counterclockwise rotation of 45° about the
x-axis, followed by a counterclockwise rotation of 60° about the y-axis, followed by a counterclockwise rotation of 30° about
the z-axis. Then find the image of the point (1, 1, 1) under this operator.

Section 4.3

T1. (Projections) Use your utility to perform the computations for $\theta = \pi/6$ in Example 6. Then project the vectors (1, 1) and (1,
-5). Repeat for $\theta = \pi/4$, $\pi/3$, $\pi/2$, $\pi$.

Section 4.4 

Tl. (Interpolation) Most technology utilities have a command that performs polynomial interpolation. Read your 

documentation, and find the command or commands for fitting a polynomial interpolant to given data. Then use it (or them) 
to confirm the result of Example 5. 
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5 



CHAPTER 



General Vector Spaces 



INTRODUCTION: In the last chapter we generalized vectors from 2- and 3-space to vectors in n-space. In this chapter
we shall generalize the concept of vector still further. We shall state a set of axioms that, if satisfied by a class of objects, will
entitle those objects to be called "vectors." These generalized vectors will include, among other things, various kinds of
matrices and functions. Our work in this chapter is not an idle exercise in theoretical mathematics; it will provide a powerful
tool for extending our geometric visualization to a wide variety of important mathematical problems where geometric intuition
would not otherwise be available. We can visualize vectors in $R^2$ and $R^3$ as arrows, which enables us to draw or form mental
pictures to help solve problems. Because the axioms we give to define our new kinds of vectors will be based on properties of
vectors in $R^2$ and $R^3$, the new vectors will have many familiar properties. Consequently, when we want to solve a problem
involving our new kinds of vectors, say matrices or functions, we may be able to get a foothold on the problem by visualizing
what the corresponding problem would be like in $R^2$ and $R^3$.
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5.1 

REAL VECTOR SPACES 



In this section we shall extend the concept of a vector by extracting the 
most important properties of familiar vectors and turning them into axioms. 
Thus, when a set of objects satisfies these axioms, they will automatically 
have the most important properties of familiar vectors, thereby making it 
reasonable to regard these objects as new kinds of vectors. 



Vector Space Axioms 

The following definition consists of ten axioms. As you read each axiom, keep in mind that you have already seen each of 
them as parts of various definitions and theorems in the preceding two chapters (for instance, see Theorem 4.1.1). 
Remember, too, that you do not prove axioms; they are simply the "rules of the game." 



DEFINITION 




Let V be an arbitrary nonempty set of objects on which two operations are defined: addition, and multiplication by scalars
(numbers). By addition we mean a rule for associating with each pair of objects u and v in V an object $u + v$, called the
sum of u and v; by scalar multiplication we mean a rule for associating with each scalar k and each object u in V an
object $ku$, called the scalar multiple of u by k. If the following axioms are satisfied by all objects u, v, w in V and all
scalars k and m, then we call V a vector space and we call the objects in V vectors.

1. If u and v are objects in V, then $u + v$ is in V.

2. $u + v = v + u$

3. $u + (v + w) = (u + v) + w$

4. There is an object 0 in V, called a zero vector for V, such that $0 + u = u + 0 = u$ for all u in V.

5. For each u in V, there is an object $-u$ in V, called a negative of u, such that $u + (-u) = (-u) + u = 0$.

6. If k is any scalar and u is any object in V, then $ku$ is in V.

7. $k(u + v) = ku + kv$

8. $(k + m)u = ku + mu$

9. $k(mu) = (km)(u)$

10. $1u = u$



Remark Depending on the application, scalars may be real numbers or complex numbers. Vector spaces in which the 
scalars are complex numbers are called complex vector spaces, and those in which the scalars must be real are called real 
vector spaces. In Chapter 10 we shall discuss complex vector spaces; until then, all of our scalars will be real numbers. 

The reader should keep in mind that the definition of a vector space specifies neither the nature of the vectors nor the
operations. Any kind of object can be a vector, and the operations of addition and scalar multiplication may not have any
relationship or similarity to the standard vector operations on $R^n$. The only requirement is that the ten vector space axioms
be satisfied. Some authors use the notations $\oplus$ and $\odot$ for vector addition and scalar multiplication to distinguish these
operations from addition and multiplication of real numbers; we will not use this convention, however.

Examples of Vector Spaces 

The following examples will illustrate the variety of possible vector spaces. In each example we will specify a nonempty set
V and two operations, addition and scalar multiplication; then we shall verify that the ten vector space axioms are satisfied,
thereby entitling V, with the specified operations, to be called a vector space.



EXAMPLE 1 $R^n$ Is a Vector Space

The set $V = R^n$ with the standard operations of addition and scalar multiplication defined in Section 4.1 is a vector space.
Axioms 1 and 6 follow from the definitions of the standard operations on $R^n$; the remaining axioms follow from Theorem
4.1.1.

The three most important special cases of $R^n$ are $R$ (the real numbers), $R^2$ (the vectors in the plane), and $R^3$ (the vectors in
3-space).



EXAMPLE 2 A Vector Space of 2 x 2 Matrices 

Show that the set V of all 2 x 2 matrices with real entries is a vector space if addition is defined to be matrix addition and 
scalar multiplication is defined to be matrix scalar multiplication. 

Solution 

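In this example we will find it convenient to verify the axioms in the following order: 1, 6, 2, 3, 7, 8, 9, 4, 5, and 10. Let

$$\mathbf{u} = \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} \quad\text{and}\quad \mathbf{v} = \begin{bmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{bmatrix}$$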


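To prove Axiom 1, we must show that u + v is an object in V; that is, we must show that u + v is a 2 x 2 matrix. But this
follows from the definition of matrix addition, since

$$\mathbf{u} + \mathbf{v} = \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} + \begin{bmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{bmatrix} = \begin{bmatrix} u_{11} + v_{11} & u_{12} + v_{12} \\ u_{21} + v_{21} & u_{22} + v_{22} \end{bmatrix}$$

Similarly, Axiom 6 holds because for any real number k, we have

$$k\mathbf{u} = k \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} = \begin{bmatrix} ku_{11} & ku_{12} \\ ku_{21} & ku_{22} \end{bmatrix}$$

so ku is a 2 x 2 matrix and consequently is an object in V.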


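Axiom 2 follows from Theorem 1.4.1a since

$$\mathbf{u} + \mathbf{v} = \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} + \begin{bmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{bmatrix} = \begin{bmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{bmatrix} + \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} = \mathbf{v} + \mathbf{u}$$

Similarly, Axiom 3 follows from part (b) of that theorem; and Axioms 7, 8, and 9 follow from parts (h), (j), and (l),
respectively.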

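To prove Axiom 4, we must find an object $\mathbf{0}$ in V such that $\mathbf{0} + \mathbf{u} = \mathbf{u} + \mathbf{0} = \mathbf{u}$ for all u in V. This can be done by defining
$\mathbf{0}$ to be

$$\mathbf{0} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$

With this definition,

$$\mathbf{0} + \mathbf{u} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} = \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} = \mathbf{u}$$

and similarly $\mathbf{u} + \mathbf{0} = \mathbf{u}$. To prove Axiom 5, we must show that each object u in V has a negative -u such that
$\mathbf{u} + (-\mathbf{u}) = \mathbf{0}$ and $(-\mathbf{u}) + \mathbf{u} = \mathbf{0}$. This can be done by defining the negative of u to be

$$-\mathbf{u} = \begin{bmatrix} -u_{11} & -u_{12} \\ -u_{21} & -u_{22} \end{bmatrix}$$

With this definition,

$$\mathbf{u} + (-\mathbf{u}) = \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} + \begin{bmatrix} -u_{11} & -u_{12} \\ -u_{21} & -u_{22} \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = \mathbf{0}$$

and similarly $(-\mathbf{u}) + \mathbf{u} = \mathbf{0}$. Finally, Axiom 10 is a simple computation:

$$1\mathbf{u} = 1 \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} = \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} = \mathbf{u}$$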



EXAMPLE 3 A Vector Space of m x n Matrices 



Example 2 is a special case of a more general class of vector spaces. The arguments in that example can be adapted to show
that the set V of all m x n matrices with real entries, together with the operations of matrix addition and scalar multiplication,
is a vector space. The m x n zero matrix is the zero vector 0, and if u is the m x n matrix U, then the matrix -U is the
negative -u of the vector u. We shall denote this vector space by the symbol $M_{mn}$.



EXAMPLE 4 A Vector Space of Real-Valued Functions 



Let V be the set of real-valued functions defined on the entire real line $(-\infty, \infty)$. If f = f(x) and g = g(x) are two such
functions and k is any real number, define the sum function f + g and the scalar multiple kf, respectively, by

$$(f + g)(x) = f(x) + g(x) \quad\text{and}\quad (kf)(x) = k f(x)$$

In other words, the value of the function f + g at x is obtained by adding together the values of f and g at x (Figure 5.1.1a).
Similarly, the value of kf at x is k times the value of f at x (Figure 5.1.1b). In the exercises we shall ask you to show that V is
a vector space with respect to these operations. This vector space is denoted by $F(-\infty, \infty)$. If f and g are vectors in this
space, then to say that f = g is equivalent to saying that f(x) = g(x) for all x in the interval $(-\infty, \infty)$.

Figure 5.1.1

The vector 0 in $F(-\infty, \infty)$ is the constant function that is identically zero for all values of x. The graph of this function is
the line that coincides with the x-axis. The negative of a vector f is the function -f = -f(x). Geometrically, the graph of
-f is the reflection of the graph of f across the x-axis (Figure 5.1.1c).



Remark In the preceding example we focused on the interval $(-\infty, \infty)$. Had we restricted our attention to some closed
interval [a, b] or some open interval (a, b), the functions defined on those intervals with the operations stated in the
example would also have produced vector spaces. Those vector spaces are denoted by F[a, b] and F(a, b), respectively.



EXAMPLE 5 A Set That Is Not a Vector Space

Let $V = R^2$ and define addition and scalar multiplication operations as follows: If $\mathbf{u} = (u_1, u_2)$ and $\mathbf{v} = (v_1, v_2)$, then define

$$\mathbf{u} + \mathbf{v} = (u_1 + v_1, u_2 + v_2)$$

and if k is any real number, then define

$$k\mathbf{u} = (ku_1, 0)$$

For example, if u = (2, 4), v = (-3, 5), and k = 7, then

$$\mathbf{u} + \mathbf{v} = (2 + (-3), 4 + 5) = (-1, 9)$$
$$k\mathbf{u} = 7\mathbf{u} = (7 \cdot 2, 0) = (14, 0)$$

The addition operation is the standard addition operation on $R^2$, but the scalar multiplication operation is not the standard
scalar multiplication. In the exercises we will ask you to show that the first nine vector space axioms are satisfied; however,
there are values of u for which Axiom 10 fails to hold. For example, if $\mathbf{u} = (u_1, u_2)$ is such that $u_2 \neq 0$, then

$$1\mathbf{u} = 1(u_1, u_2) = (1 \cdot u_1, 0) = (u_1, 0) \neq \mathbf{u}$$

Thus V is not a vector space with the stated operations.
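The failure of Axiom 10 in Example 5 can also be seen numerically. The following is a minimal sketch, not part of the text's argument; the function names `add` and `scalar_mul` are illustrative choices, and vectors in $R^2$ are modeled as Python tuples.

```python
# Sketch of Example 5: standard addition on R^2 combined with the
# nonstandard scalar multiplication k(u1, u2) = (k*u1, 0).
# The names add and scalar_mul are illustrative, not from the text.

def add(u, v):
    """Standard componentwise addition on R^2."""
    return (u[0] + v[0], u[1] + v[1])

def scalar_mul(k, u):
    """Nonstandard scalar multiplication: the second component is always 0."""
    return (k * u[0], 0)

u = (2, 4)
v = (-3, 5)

print(add(u, v))         # (-1, 9)
print(scalar_mul(7, u))  # (14, 0)

# Axiom 10 requires 1u = u, but here the second component of u is lost.
print(scalar_mul(1, u) == u)  # False
```

A single counterexample such as u = (2, 4) is enough to show that an axiom fails; showing that an axiom holds still requires the general argument.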



EXAMPLE 6 Every Plane through the Origin Is a Vector Space 



Let V be any plane through the origin in $R^3$. We shall show that the points in V form a vector space under the standard
addition and scalar multiplication operations for vectors in $R^3$. From Example 1, we know that $R^3$ itself is a vector space
under these operations. Thus Axioms 2, 3, 7, 8, 9, and 10 hold for all points in $R^3$ and consequently for all points in the
plane V. We therefore need only show that Axioms 1, 4, 5, and 6 are satisfied.

Since the plane V passes through the origin, it has an equation of the form

$$ax + by + cz = 0 \qquad (1)$$

(Theorem 3.5.1). Thus, if $\mathbf{u} = (u_1, u_2, u_3)$ and $\mathbf{v} = (v_1, v_2, v_3)$ are points in V, then $au_1 + bu_2 + cu_3 = 0$ and
$av_1 + bv_2 + cv_3 = 0$. Adding these equations gives

$$a(u_1 + v_1) + b(u_2 + v_2) + c(u_3 + v_3) = 0$$

This equality tells us that the coordinates of the point

$$\mathbf{u} + \mathbf{v} = (u_1 + v_1, u_2 + v_2, u_3 + v_3)$$

satisfy (1); thus u + v lies in the plane V. This proves that Axiom 1 is satisfied. The verifications of Axioms 4 and 6 are left as
exercises; however, we shall prove that Axiom 5 is satisfied. Multiplying $au_1 + bu_2 + cu_3 = 0$ through by -1 gives

$$a(-u_1) + b(-u_2) + c(-u_3) = 0$$

Thus $-\mathbf{u} = (-u_1, -u_2, -u_3)$ lies in V. This establishes Axiom 5.
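As an informal check of the closure arguments in Example 6, the sketch below tests Axioms 1, 5, and 6 on one specific plane. The plane 2x - y + 3z = 0 and the sample points are assumptions chosen only for illustration; exact rational arithmetic from the standard library avoids rounding issues.

```python
from fractions import Fraction

# Hypothetical plane through the origin, chosen for illustration: 2x - y + 3z = 0.
A, B, C = 2, -1, 3

def on_plane(p):
    """Return True if the point p = (x, y, z) satisfies ax + by + cz = 0."""
    x, y, z = p
    return A * x + B * y + C * z == 0

def add(u, v):
    """Standard addition in R^3."""
    return tuple(a + b for a, b in zip(u, v))

def scale(k, u):
    """Standard scalar multiplication in R^3."""
    return tuple(k * a for a in u)

u = (1, 2, 0)    # 2*1 - 2 + 3*0 = 0
v = (3, 0, -2)   # 2*3 - 0 + 3*(-2) = 0

print(on_plane(u), on_plane(v))            # True True
print(on_plane(add(u, v)))                 # True (Axiom 1)
print(on_plane(scale(-1, u)))              # True (Axiom 5)
print(on_plane(scale(Fraction(7, 3), v)))  # True (Axiom 6)
```

Checking sample points only illustrates the result; the algebraic argument in the example is what proves it for every point of every plane through the origin.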



EXAMPLE 7 The Zero Vector Space 



Let V consist of a single object, which we denote by 0, and define

$$\mathbf{0} + \mathbf{0} = \mathbf{0} \quad\text{and}\quad k\mathbf{0} = \mathbf{0}$$

for all scalars k. It is easy to check that all the vector space axioms are satisfied. We call this the zero vector space.

Some Properties of Vectors 

As we progress, we shall add more examples of vector spaces to our list. We conclude this section with a theorem that gives 
a useful list of vector properties. 



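THEOREM 5.1.1

Let V be a vector space, u a vector in V, and k a scalar; then:

(a) $0\mathbf{u} = \mathbf{0}$

(b) $k\mathbf{0} = \mathbf{0}$

(c) $(-1)\mathbf{u} = -\mathbf{u}$

(d) If $k\mathbf{u} = \mathbf{0}$, then $k = 0$ or $\mathbf{u} = \mathbf{0}$.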



We shall prove parts (a) and (c) and leave proofs of the remaining parts as exercises. 



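Proof (a) We can write

$$0\mathbf{u} + 0\mathbf{u} = (0 + 0)\mathbf{u} \qquad \text{[Axiom 8]}$$
$$= 0\mathbf{u} \qquad \text{[Property of the number 0]}$$

By Axiom 5 the vector $0\mathbf{u}$ has a negative, $-0\mathbf{u}$. Adding this negative to both sides above yields

$$[0\mathbf{u} + 0\mathbf{u}] + (-0\mathbf{u}) = 0\mathbf{u} + (-0\mathbf{u})$$

or

$$0\mathbf{u} + [0\mathbf{u} + (-0\mathbf{u})] = 0\mathbf{u} + (-0\mathbf{u}) \qquad \text{[Axiom 3]}$$
$$0\mathbf{u} + \mathbf{0} = \mathbf{0} \qquad \text{[Axiom 5]}$$
$$0\mathbf{u} = \mathbf{0} \qquad \text{[Axiom 4]}$$

Proof (c) To show that $(-1)\mathbf{u} = -\mathbf{u}$, we must demonstrate that $\mathbf{u} + (-1)\mathbf{u} = \mathbf{0}$. To see this, observe that

$$\mathbf{u} + (-1)\mathbf{u} = 1\mathbf{u} + (-1)\mathbf{u} \qquad \text{[Axiom 10]}$$
$$= (1 + (-1))\mathbf{u} \qquad \text{[Axiom 8]}$$
$$= 0\mathbf{u} \qquad \text{[Property of numbers]}$$
$$= \mathbf{0} \qquad \text{[Part (a) above]}$$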



Exercise Set 5.1 






In Exercises 1-16 a set of objects is given, together with operations of addition and scalar multiplication. Determine which
sets are vector spaces under the given operations. For those that are not vector spaces, list all axioms that fail to hold.

1. The set of all triples of real numbers (x, y, z) with the operations
$(x, y, z) + (x', y', z') = (x + x', y + y', z + z')$ and $k(x, y, z) = (kx, y, z)$

2. The set of all triples of real numbers (x, y, z) with the operations
$(x, y, z) + (x', y', z') = (x + x', y + y', z + z')$ and $k(x, y, z) = (0, 0, 0)$

3. The set of all pairs of real numbers (x, y) with the operations
$(x, y) + (x', y') = (x + x', y + y')$ and $k(x, y) = (2kx, 2ky)$

4. The set of all real numbers x with the standard operations of addition and multiplication.

5. The set of all pairs of real numbers of the form (x, 0) with the standard operations on $R^2$.

6. The set of all pairs of real numbers of the form (x, y), where $x \geq 0$, with the standard operations on $R^2$.

7. The set of all n-tuples of real numbers of the form (x, x, ..., x) with the standard operations on $R^n$.

8. The set of all pairs of real numbers (x, y) with the operations
$(x, y) + (x', y') = (x + x' + 1, y + y' + 1)$ and $k(x, y) = (kx, ky)$

9. The set of all 2 x 2 matrices of the form
$$\begin{bmatrix} a & 1 \\ 1 & b \end{bmatrix}$$
with the standard matrix addition and scalar multiplication.

10. The set of all 2 x 2 matrices of the form
$$\begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}$$
with the standard matrix addition and scalar multiplication.

11. The set of all real-valued functions f defined everywhere on the real line and such that f(1) = 0, with the operations
defined in Example 4.

12. The set of all 2 x 2 matrices of the form
$$\begin{bmatrix} a & a + b \\ a + b & b \end{bmatrix}$$
with matrix addition and scalar multiplication.

13. The set of all pairs of real numbers of the form (1, x) with the operations

14. The set of polynomials of the form a + bx with the operations
$(a_0 + a_1 x) + (b_0 + b_1 x) = (a_0 + b_0) + (a_1 + b_1)x$ and $k(a_0 + a_1 x) = (ka_0) + (ka_1)x$

15. The set of all positive real numbers with the operations
$x + y = xy$ and $kx = x^k$

16. The set of all pairs of real numbers (x, y) with the operations
$(x, y) + (x', y') = (xx', yy')$ and $k(x, y) = (kx, ky)$



17. Show that the following sets with the given operations fail to be vector spaces by identifying all axioms that fail to hold.

(a) The set of all triples of real numbers with the standard vector addition but with scalar multiplication defined by
$k(x, y, z) = (k^2 x, k^2 y, k^2 z)$.

(b) The set of all triples of real numbers with addition defined by $(x, y, z) + (u, v, w) = (z + w, y + v, x + u)$ and
standard scalar multiplication.

(c) The set of all 2 x 2 invertible matrices with the standard matrix addition and scalar multiplication.

18. Show that the set of all 2 x 2 matrices of the form
$$\begin{bmatrix} a & 1 \\ 1 & b \end{bmatrix}$$
with addition defined by
$$\begin{bmatrix} a & 1 \\ 1 & b \end{bmatrix} + \begin{bmatrix} c & 1 \\ 1 & d \end{bmatrix} = \begin{bmatrix} a + c & 1 \\ 1 & b + d \end{bmatrix}$$
and scalar multiplication defined by
$$k \begin{bmatrix} a & 1 \\ 1 & b \end{bmatrix} = \begin{bmatrix} ka & 1 \\ 1 & kb \end{bmatrix}$$
is a vector space. What is the zero vector in this space?



19. (a) Show that the set of all points in $R^2$ lying on a line is a vector space, with respect to the standard operations of
vector addition and scalar multiplication, exactly when the line passes through the origin.

(b) Show that the set of all points in $R^3$ lying on a plane is a vector space, with respect to the standard operations of
vector addition and scalar multiplication, exactly when the plane passes through the origin.

20. Consider the set of all 2 x 2 invertible matrices with vector addition defined to be matrix multiplication and the standard
scalar multiplication. Is this a vector space?

21. Show that the first nine vector space axioms are satisfied if $V = R^2$ has the addition and scalar multiplication operations
defined in Example 5.

22. Prove that a line passing through the origin in $R^3$ is a vector space under the standard operations on $R^3$.

23. Complete the unfinished details of Example 4.

24. Complete the unfinished details of Example 6.



Discussion
Discovery

25. We showed in Example 6 that every plane in $R^3$ that passes through the origin is a vector
space under the standard operations on $R^3$. Is the same true for planes that do not pass through
the origin? Explain your reasoning.

26. It was shown in Exercise 14 above that the set of polynomials of degree 1 or less is a vector
space under the operations stated in that exercise. Is the set of polynomials whose degree is
exactly 1 a vector space under those operations? Explain your reasoning.

27. Consider the set whose only element is the moon. Is this set a vector space under the
operations moon + moon = moon and k(moon) = moon for every real number k? Explain your
reasoning.

28. Do you think that it is possible to have a vector space with exactly two distinct vectors in it?
Explain your reasoning.



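29. The following is a proof of part (b) of Theorem 5.1.1. Justify each step by filling in the blank
line with the word hypothesis or by specifying the number of one of the vector space axioms
given in this section.

Hypothesis: Let u be any vector in a vector space V, 0 the zero vector in V, and k a scalar.

Conclusion: Then $k\mathbf{0} = \mathbf{0}$.

Proof:

1. First, $k\mathbf{0} + k\mathbf{u} = k(\mathbf{0} + \mathbf{u})$. ______

2. $= k\mathbf{u}$. ______

3. Since $k\mathbf{u}$ is in V, $-k\mathbf{u}$ is in V. ______

4. Therefore, $(k\mathbf{0} + k\mathbf{u}) + (-k\mathbf{u}) = k\mathbf{u} + (-k\mathbf{u})$. ______

5. $k\mathbf{0} + (k\mathbf{u} + (-k\mathbf{u})) = k\mathbf{u} + (-k\mathbf{u})$ ______

6. $k\mathbf{0} + \mathbf{0} = \mathbf{0}$ ______

7. Finally, $k\mathbf{0} = \mathbf{0}$. ______

30. Prove part (d) of Theorem 5.1.1.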



31. The following is a proof that the cancellation law for addition holds in a vector space. Justify
each step by filling in the blank line with the word hypothesis or by specifying the number of
one of the vector space axioms given in this section.

Hypothesis: Let u, v, and w be vectors in a vector space V and suppose that u + w = v + w.

Conclusion: Then u = v.

Proof:

1. First, (u + w) + (-w) and (v + w) + (-w) are vectors in V. ______

2. Then (u + w) + (-w) = (v + w) + (-w). ______

3. The left side of the equality in step (2) is (u + w) + (-w) = u + (w + (-w)) ______
= u. ______

4. The right side of the equality in step (2) is (v + w) + (-w) = v + (w + (-w)) ______
= v. ______

From the equality in step (2), it follows from steps (3) and (4) that u = v.

32. Do you think it is possible for a vector space to have two different zero vectors? That is, is it
possible to have two different vectors $\mathbf{0}_1$ and $\mathbf{0}_2$ such that these vectors both satisfy Axiom 4?
Explain your reasoning.

33. Do you think that it is possible for a vector u in a vector space to have two different
negatives? That is, is it possible to have two different vectors $(-\mathbf{u})_1$ and $(-\mathbf{u})_2$, both of
which satisfy Axiom 5? Explain your reasoning.

34. The set of ten axioms of a vector space is not an independent set because Axiom 2 can be
deduced from other axioms in the set. Using the expression

(u + v) - (v + u)

and Axiom 7 as a starting point, prove that u + v = v + u.

Hint You can use Theorem 5.1.1 since the proof of each part of that theorem does not use
Axiom 2.

5.2 SUBSPACES

It is possible for one vector space to be contained within another vector
space. For example, we showed in the preceding section that planes
through the origin are vector spaces that are contained in the vector space
$R^3$. In this section we shall study this important concept in detail.



A subset of a vector space V that is itself a vector space with respect to the operations of vector addition and scalar 
multiplication defined on V is given a special name. 



DEFINITION 



A subset W of a vector space V is called a subspace of V if W is itself a vector space under the addition and scalar 
multiplication defined on V. 



In general, one must verify the ten vector space axioms to show that a set W with addition and scalar multiplication forms a
vector space. However, if W is part of a larger set V that is already known to be a vector space, then certain axioms need not
be verified for W because they are "inherited" from V. For example, there is no need to check that u + v = v + u (Axiom 2)
for W because this holds for all vectors in V and consequently for all vectors in W. Other axioms inherited by W from V are 3,
7, 8, 9, and 10. Thus, to show that a set W is a subspace of a vector space V, we need only verify Axioms 1, 4, 5, and 6. The
following theorem shows that even Axioms 4 and 5 can be omitted.

THEOREM 5.2.1 



If W is a set of one or more vectors from a vector space V, then W is a subspace of V if and only if the following
conditions hold.

(a) If u and v are vectors in W, then u + v is in W.

(b) If k is any scalar and u is any vector in W, then ku is in W.



Proof If W is a subspace of V, then all the vector space axioms are satisfied; in particular, Axioms 1 and 6 hold. But these
are precisely conditions (a) and (b).

Conversely, assume conditions (a) and (b) hold. Since these conditions are vector space Axioms 1 and 6, we need only show
that W satisfies the remaining eight axioms. Axioms 2, 3, 7, 8, 9, and 10 are automatically satisfied by the vectors in W since
they are satisfied by all vectors in V. Therefore, to complete the proof, we need only verify that Axioms 4 and 5 are satisfied
by vectors in W.

Let u be any vector in W. By condition (b), ku is in W for every scalar k. Setting k = 0, it follows from Theorem 5.1.1 that
$0\mathbf{u} = \mathbf{0}$ is in W, and setting k = -1, it follows that $(-1)\mathbf{u} = -\mathbf{u}$ is in W.



Remark A set W of one or more vectors from a vector space V is said to be closed under addition if condition (a) in
Theorem 5.2.1 holds and closed under scalar multiplication if condition (b) holds. Thus Theorem 5.2.1 states that W is a
subspace of V if and only if W is closed under addition and closed under scalar multiplication.



EXAMPLE 1 Testing for a Subspace 



In Example 6 of Section 5.1 we verified the ten vector space axioms to show that the points in a plane through the origin of
$R^3$ form a subspace of $R^3$. In light of Theorem 5.2.1 we can see that much of that work was unnecessary; it would have been
sufficient to verify that the plane is closed under addition and scalar multiplication (Axioms 1 and 6). In Section 5.1 we
verified those two axioms algebraically; however, they can also be proved geometrically as follows: Let W be any plane
through the origin, and let u and v be any vectors in W. Then u + v must lie in W because it is the diagonal of the
parallelogram determined by u and v (Figure 5.2.1), and ku must lie in W for any scalar k because ku lies on a line through u.
Thus W is closed under addition and scalar multiplication, so it is a subspace of $R^3$.

Figure 5.2.1

The vectors u + v and ku both lie in the same plane as u and v.



EXAMPLE 2 Lines through the Origin Are Subspaces 



Show that a line through the origin of $R^3$ is a subspace of $R^3$.

Solution

Let W be a line through the origin of $R^3$. It is evident geometrically that the sum of two vectors on this line also lies on the
line and that a scalar multiple of a vector on the line is on the line as well (Figure 5.2.2). Thus W is closed under addition and
scalar multiplication, so it is a subspace of $R^3$. In the exercises we will ask you to prove this result algebraically using
parametric equations for the line.

Figure 5.2.2

(a) W is closed under addition. (b) W is closed under scalar multiplication.



EXAMPLE 3 Subset of $R^2$ That Is Not a Subspace

Let W be the set of all points (x, y) in $R^2$ such that $x \geq 0$ and $y \geq 0$. These are the points in the first quadrant. The set W is
not a subspace of $R^2$ since it is not closed under scalar multiplication. For example, v = (1, 1) lies in W, but its negative
(-1)v = -v = (-1, -1) does not (Figure 5.2.3).

Figure 5.2.3

W is not closed under scalar multiplication.
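The closure test of Theorem 5.2.1 can be tried on the first-quadrant set of Example 3. The sketch below is illustrative only: the membership function `in_W` assumes the non-strict inequalities x ≥ 0 and y ≥ 0, and the helper names are not from the text.

```python
# Sketch of Example 3: the first quadrant is closed under addition
# but not under scalar multiplication, so it is not a subspace of R^2.

def in_W(p):
    """Membership test for W = {(x, y) : x >= 0 and y >= 0} (assumed form)."""
    x, y = p
    return x >= 0 and y >= 0

def add(u, v):
    """Standard addition in R^2."""
    return (u[0] + v[0], u[1] + v[1])

def scale(k, u):
    """Standard scalar multiplication in R^2."""
    return (k * u[0], k * u[1])

v = (1, 1)
w = (2, 5)

print(in_W(v), in_W(w))    # True True
print(in_W(add(v, w)))     # True: this sum stays in W
print(in_W(scale(-1, v)))  # False: (-1, -1) leaves W
```

One failing scalar multiple is enough to rule out a subspace, since condition (b) of Theorem 5.2.1 must hold for every scalar and every vector in W.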



Every nonzero vector space V has at least two subspaces: V itself is a subspace, and the set {0} consisting of just the zero
vector in V is a subspace called the zero subspace. Combining this with Examples 1 and 2, we obtain the
following list of subspaces of $R^2$ and $R^3$:

Subspaces of $R^2$

* {0}
* Lines through the origin
* $R^2$

Subspaces of $R^3$

* {0}
* Lines through the origin
* Planes through the origin
* $R^3$

Later, we will show that these are the only subspaces of $R^2$ and $R^3$.



EXAMPLE 4 Subspaces of $M_{nn}$

From Theorem 1.7.2, the sum of two symmetric matrices is symmetric, and a scalar multiple of a symmetric matrix is
symmetric. Thus the set of n x n symmetric matrices is a subspace of the vector space $M_{nn}$ of all n x n matrices. Similarly,
the set of n x n upper triangular matrices, the set of n x n lower triangular matrices, and the set of n x n diagonal matrices all
form subspaces of $M_{nn}$, since each of these sets is closed under addition and scalar multiplication.



EXAMPLE 5 A Subspace of Polynomials of Degree $\leq n$

Let n be a nonnegative integer, and let W consist of all functions expressible in the form

$$p(x) = a_0 + a_1 x + \cdots + a_n x^n \qquad (1)$$

where $a_0, \ldots, a_n$ are real numbers. Thus W consists of all real polynomials of degree n or less. The set W is a subspace of the
vector space of all real-valued functions discussed in Example 4 of the preceding section. To see this, let p and q be the
polynomials

$$p(x) = a_0 + a_1 x + \cdots + a_n x^n \quad\text{and}\quad q(x) = b_0 + b_1 x + \cdots + b_n x^n$$

Then

$$(p + q)(x) = p(x) + q(x) = (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_n + b_n)x^n$$

and

$$(kp)(x) = kp(x) = (ka_0) + (ka_1)x + \cdots + (ka_n)x^n$$

These functions have the form given in (1), so p + q and kp lie in W. As in Section 4.4, we shall denote the vector space W in
this example by the symbol $P_n$.



The CMYK Color Model 

Color magazines and books are printed using what is called a CMYK color model. Colors in this model are created
using four colored inks: cyan (C), magenta (M), yellow (Y), and black (K). The colors can be created either by mixing inks of the four types and
printing with the mixed inks (the spot color method) or by printing dot patterns (called rosettes) with the four colors and
allowing the reader's eye and perception process to create the desired color combination (the process color method).
There is a numbering system for commercial inks, called the Pantone Matching System, that assigns every commercial
ink color a number in accordance with its percentages of cyan, magenta, yellow, and black. One way to represent a
Pantone color is by associating the four base colors with the vectors

c = (1, 0, 0, 0) (pure cyan)
m = (0, 1, 0, 0) (pure magenta)
y = (0, 0, 1, 0) (pure yellow)
k = (0, 0, 0, 1) (pure black)

in $R^4$ and describing the ink color as a linear combination of these using coefficients between
0 and 1, inclusive. Thus, an ink color p is represented as a linear combination of the form

$$\mathbf{p} = c_1\mathbf{c} + c_2\mathbf{m} + c_3\mathbf{y} + c_4\mathbf{k}$$

where $0 \leq c_i \leq 1$. The set of all such linear combinations is called CMYK space, although it is not
a subspace of $R^4$. (Why?) For example, Pantone color 876CVC is a mixture of 38% cyan, 59%
magenta, 73% yellow, and 7% black; Pantone color 216CVC is a mixture of 0% cyan, 83%
magenta, 34% yellow, and 47% black; and Pantone color 328CVC is a mixture of 100% cyan,
0% magenta, 47% yellow, and 30% black. We can denote these colors by
$\mathbf{p}_{876} = (0.38, 0.59, 0.73, 0.07)$, $\mathbf{p}_{216} = (0, 0.83, 0.34, 0.47)$, and $\mathbf{p}_{328} = (1, 0, 0.47, 0.30)$, respectively.
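One way to explore the question of why CMYK space is not a subspace of $R^4$ is to test the closure conditions of Theorem 5.2.1 on the Pantone colors quoted above. This is a rough sketch; the function name `in_cmyk_space` is illustrative, and the coordinates are taken from the percentages stated in the text.

```python
# Sketch: CMYK space is the set of 4-tuples with every coordinate in [0, 1].
# Testing closure under addition and scalar multiplication on sample colors.

def in_cmyk_space(p):
    """Return True if p has four coordinates, each between 0 and 1 inclusive."""
    return len(p) == 4 and all(0 <= c <= 1 for c in p)

p876 = (0.38, 0.59, 0.73, 0.07)  # 38% cyan, 59% magenta, 73% yellow, 7% black
p216 = (0.0, 0.83, 0.34, 0.47)   # 0% cyan, 83% magenta, 34% yellow, 47% black

total = tuple(a + b for a, b in zip(p876, p216))  # standard addition in R^4
doubled = tuple(2 * a for a in p876)              # scalar multiple with k = 2

print(in_cmyk_space(p876), in_cmyk_space(p216))  # True True
print(in_cmyk_space(total))    # False: the magenta coordinates sum past 1
print(in_cmyk_space(doubled))  # False: 2 * 0.59 exceeds 1
```

Either failure alone shows that the set is not closed, so it cannot be a subspace.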



EXAMPLE 6 Subspaces of Functions Continuous on $(-\infty, \infty)$

Calculus Required

Recall from calculus that if f and g are continuous functions on the interval $(-\infty, \infty)$ and k is a constant, then f + g and
kf are also continuous. Thus the continuous functions on the interval $(-\infty, \infty)$ form a subspace of $F(-\infty, \infty)$, since
they are closed under addition and scalar multiplication. We denote this subspace by $C(-\infty, \infty)$. Similarly, if f and g
have continuous first derivatives on $(-\infty, \infty)$, then so do f + g and kf. Thus the functions with continuous first
derivatives on $(-\infty, \infty)$ form a subspace of $F(-\infty, \infty)$. We denote this subspace by $C^1(-\infty, \infty)$, where the
superscript 1 is used to emphasize the first derivative. However, it is a theorem of calculus that every differentiable function
is continuous, so $C^1(-\infty, \infty)$ is actually a subspace of $C(-\infty, \infty)$.

To take this a step further, for each positive integer m, the functions with continuous mth derivatives on $(-\infty, \infty)$ form a
subspace of $C^1(-\infty, \infty)$ as do the functions that have continuous derivatives of all orders. We denote the subspace of
functions with continuous mth derivatives on $(-\infty, \infty)$ by $C^m(-\infty, \infty)$, and we denote the subspace of functions that
have continuous derivatives of all orders on $(-\infty, \infty)$ by $C^\infty(-\infty, \infty)$. Finally, it is a theorem of calculus that
polynomials have continuous derivatives of all orders, so $P_n$ is a subspace of $C^\infty(-\infty, \infty)$. The hierarchy of subspaces
discussed in this example is illustrated in Figure 5.2.4.




Figure 5.2.4 



Remark In the preceding example we focused on the interval $(-\infty, \infty)$. Had we focused on a closed interval [a, b],
then the subspaces corresponding to those defined in the example would be denoted by $C[a, b]$, $C^m[a, b]$, and $C^\infty[a, b]$.
Similarly, on an open interval (a, b) they would be denoted by $C(a, b)$, $C^m(a, b)$, and $C^\infty(a, b)$.



Solution Spaces of Homogeneous Systems 

If $A\mathbf{x} = \mathbf{b}$ is a system of linear equations, then each vector x that satisfies this equation is called a solution vector of the
system. The following theorem shows that the solution vectors of a homogeneous linear system form a vector space, which
we shall call the solution space of the system.

THEOREM 5.2.2

If $A\mathbf{x} = \mathbf{0}$ is a homogeneous linear system of m equations in n unknowns, then the set of solution vectors is a
subspace of $R^n$.

Proof Let W be the set of solution vectors. There is at least one vector in W, namely 0. To show that W is closed under
addition and scalar multiplication, we must show that if x and x' are any solution vectors and k is any scalar, then x + x' and
kx are also solution vectors. But if x and x' are solution vectors, then

$$A\mathbf{x} = \mathbf{0} \quad\text{and}\quad A\mathbf{x}' = \mathbf{0}$$

from which it follows that

$$A(\mathbf{x} + \mathbf{x}') = A\mathbf{x} + A\mathbf{x}' = \mathbf{0} + \mathbf{0} = \mathbf{0}$$

and

$$A(k\mathbf{x}) = kA\mathbf{x} = k\mathbf{0} = \mathbf{0}$$

which proves that x + x' and kx are solution vectors.



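EXAMPLE 7 Solution Spaces That Are Subspaces of $R^3$

Consider the linear systems

(a) $$\begin{bmatrix} 1 & -2 & 3 \\ 2 & -4 & 6 \\ 3 & -6 & 9 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

(b) $$\begin{bmatrix} 1 & -2 & 3 \\ -3 & 7 & -8 \\ -2 & 4 & -6 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

(c) $$\begin{bmatrix} 1 & -2 & 3 \\ -3 & 7 & -8 \\ 4 & 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

(d) $$\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

Each of these systems has three unknowns, so the solutions form subspaces of $R^3$. Geometrically, this means that each
solution space must be the origin only, a line through the origin, a plane through the origin, or all of $R^3$. We shall now verify
that this is so (leaving it to the reader to solve the systems).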

Solution 

(a) The solutions are

$$x = 2s - 3t, \quad y = s, \quad z = t$$

from which it follows that

$$x = 2y - 3z \quad\text{or}\quad x - 2y + 3z = 0$$

This is the equation of the plane through the origin with n = (1, -2, 3) as a normal vector.

(b) The solutions are

$$x = -5t, \quad y = -t, \quad z = t$$

which are parametric equations for the line through the origin parallel to the vector
v = (-5, -1, 1).

(c) The solution is x = 0, y = 0, z = 0, so the solution space is the origin only, that is, {0}.

(d) The solutions are

$$x = r, \quad y = s, \quad z = t$$

where r, s, and t have arbitrary values, so the solution space is all of $R^3$.
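The parametric solutions stated for systems (a) and (b) can be spot-checked by substituting them back into the systems. The sketch below is only a sanity check over a few integer parameter values, not a derivation; `mat_vec`, `sol_a`, and `sol_b` are illustrative names.

```python
# Sketch: substitute the stated solutions of systems (a) and (b) of Example 7
# back into A x = 0 for a small grid of integer parameter values.

def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

A_a = [[1, -2, 3], [2, -4, 6], [3, -6, 9]]
A_b = [[1, -2, 3], [-3, 7, -8], [-2, 4, -6]]

def sol_a(s, t):
    """Stated solution of system (a): x = 2s - 3t, y = s, z = t."""
    return [2 * s - 3 * t, s, t]

def sol_b(t):
    """Stated solution of system (b): x = -5t, y = -t, z = t."""
    return [-5 * t, -t, t]

ok_a = all(mat_vec(A_a, sol_a(s, t)) == [0, 0, 0]
           for s in range(-3, 4) for t in range(-3, 4))
ok_b = all(mat_vec(A_b, sol_b(t)) == [0, 0, 0] for t in range(-3, 4))

print(ok_a, ok_b)  # True True
```

The check confirms that every listed vector is a solution; showing that there are no other solutions still requires solving the systems, as the example leaves to the reader.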



In Section 1.3 we introduced the concept of a linear combination of column vectors. The following definition extends this 
idea to more general vectors. 



DEFINITION 



A vector w is called a linear combination of the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ if it can be expressed in the form

$$\mathbf{w} = k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_r\mathbf{v}_r$$

where $k_1, k_2, \ldots, k_r$ are scalars.

Remark If r = 1, then the equation in the preceding definition reduces to $\mathbf{w} = k_1\mathbf{v}_1$; that is, w is a linear combination of a
single vector $\mathbf{v}_1$ if it is a scalar multiple of $\mathbf{v}_1$.



EXAMPLE 8 Vectors in $R^3$ Are Linear Combinations of i, j, and k

Every vector v = (a, b, c) in $R^3$ is expressible as a linear combination of the standard basis vectors

$$\mathbf{i} = (1, 0, 0), \quad \mathbf{j} = (0, 1, 0), \quad \mathbf{k} = (0, 0, 1)$$

since

$$\mathbf{v} = (a, b, c) = a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1) = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}$$



EXAMPLE 9 Checking a Linear Combination 



Consider the vectors u = (1, 2, -1) and v = (6, 4, 2) in $R^3$. Show that w = (9, 2, 7) is a linear combination of u and v and
that w' = (4, -1, 8) is not a linear combination of u and v.

Solution

In order for w to be a linear combination of u and v, there must be scalars $k_1$ and $k_2$ such that $\mathbf{w} = k_1\mathbf{u} + k_2\mathbf{v}$; that is,

$$(9, 2, 7) = k_1(1, 2, -1) + k_2(6, 4, 2)$$

or

$$(9, 2, 7) = (k_1 + 6k_2, \; 2k_1 + 4k_2, \; -k_1 + 2k_2)$$

Equating corresponding components gives

$$k_1 + 6k_2 = 9$$
$$2k_1 + 4k_2 = 2$$
$$-k_1 + 2k_2 = 7$$

Solving this system using Gaussian elimination yields $k_1 = -3$, $k_2 = 2$, so

$$\mathbf{w} = -3\mathbf{u} + 2\mathbf{v}$$

Similarly, for w' to be a linear combination of u and v, there must be scalars $k_1$ and $k_2$ such that $\mathbf{w}' = k_1\mathbf{u} + k_2\mathbf{v}$; that is,

$$(4, -1, 8) = k_1(1, 2, -1) + k_2(6, 4, 2)$$

or

$$(4, -1, 8) = (k_1 + 6k_2, \; 2k_1 + 4k_2, \; -k_1 + 2k_2)$$

Equating corresponding components gives

$$k_1 + 6k_2 = 4$$
$$2k_1 + 4k_2 = -1$$
$$-k_1 + 2k_2 = 8$$

This system of equations is inconsistent (verify), so no such scalars $k_1$ and $k_2$ exist. Consequently, w' is not a linear
combination of u and v.
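The two systems in Example 9 can be checked mechanically. The sketch below does not use Gaussian elimination as the text does; instead it solves the first two equations by Cramer's rule and then tests the third, which is a simpler substitute that works here because the first two components of u and v form an invertible 2 x 2 system. The function name `combo_coeffs` is hypothetical, and the third component of w' is taken to be 8.

```python
from fractions import Fraction

def combo_coeffs(u, v, w):
    """Return (k1, k2) with w = k1*u + k2*v in R^3, or None if no such scalars exist.

    Assumes the 2x2 system built from the first two components is invertible,
    which holds for the vectors of Example 9.
    """
    det = u[0] * v[1] - v[0] * u[1]
    if det == 0:
        raise ValueError("first two components do not determine k1 and k2")
    # Cramer's rule on the first two equations, in exact arithmetic.
    k1 = Fraction(w[0] * v[1] - v[0] * w[1], det)
    k2 = Fraction(u[0] * w[1] - w[0] * u[1], det)
    # The candidate scalars must also satisfy the third equation.
    if k1 * u[2] + k2 * v[2] == w[2]:
        return k1, k2
    return None

u = (1, 2, -1)
v = (6, 4, 2)

print(combo_coeffs(u, v, (9, 2, 7)))   # (Fraction(-3, 1), Fraction(2, 1))
print(combo_coeffs(u, v, (4, -1, 8)))  # None
```

For w' the first two equations force $k_1 = -11/4$ and $k_2 = 9/8$, which give 5 rather than 8 in the third equation, so the system is inconsistent.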

Spanning 

If $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ are vectors in a vector space V, then generally some vectors in V may be linear combinations of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$
and others may not. The following theorem shows that if we construct a set W consisting of all those vectors that are
expressible as linear combinations of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$, then W forms a subspace of V.

THEOREM 5.2.3

If $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ are vectors in a vector space V, then

(a) The set W of all linear combinations of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ is a subspace of V.

(b) W is the smallest subspace of V that contains $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ in the sense that every other subspace of V that
contains $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ must contain W.

Proof (a) To show that W is a subspace of V, we must prove that it is closed under addition and scalar multiplication. There
is at least one vector in W, namely 0, since $\mathbf{0} = 0\mathbf{v}_1 + 0\mathbf{v}_2 + \cdots + 0\mathbf{v}_r$. If u and v are vectors in W, then

$$\mathbf{u} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_r\mathbf{v}_r$$

and

$$\mathbf{v} = k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_r\mathbf{v}_r$$

where $c_1, c_2, \ldots, c_r, k_1, k_2, \ldots, k_r$ are scalars. Therefore,

$$\mathbf{u} + \mathbf{v} = (c_1 + k_1)\mathbf{v}_1 + (c_2 + k_2)\mathbf{v}_2 + \cdots + (c_r + k_r)\mathbf{v}_r$$

and, for any scalar k,

$$k\mathbf{u} = (kc_1)\mathbf{v}_1 + (kc_2)\mathbf{v}_2 + \cdots + (kc_r)\mathbf{v}_r$$

Thus u + v and ku are linear combinations of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ and consequently lie in W. Therefore, W is
closed under addition and scalar multiplication.

Proof (b) Each vector $\mathbf{v}_i$ is a linear combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ since we can write

$$\mathbf{v}_i = 0\mathbf{v}_1 + 0\mathbf{v}_2 + \cdots + 1\mathbf{v}_i + \cdots + 0\mathbf{v}_r$$

Therefore, the subspace W contains each of the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$. Let W' be any other subspace
that contains $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$. Since W' is closed under addition and scalar multiplication, it must contain
all linear combinations of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$. Thus, W' contains each vector of W.

■ 

We make the following definition. 



DEFINITION 



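If $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a set of vectors in a vector space V, then the subspace W of V consisting of all linear
combinations of the vectors in S is called the space spanned by $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$, and we say that the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$
span W. To indicate that W is the space spanned by the vectors in the set $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$, we write

$$W = \operatorname{span}(S) \quad\text{or}\quad W = \operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$$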



EXAMPLE 10 Spaces Spanned by One or Two Vectors 



If $\mathbf{v}_1$ and $\mathbf{v}_2$ are noncollinear vectors in $R^3$ with their initial points at the origin, then $\mathrm{span}\{\mathbf{v}_1, \mathbf{v}_2\}$, which consists of all
linear combinations $k_1\mathbf{v}_1 + k_2\mathbf{v}_2$, is the plane determined by $\mathbf{v}_1$ and $\mathbf{v}_2$ (see Figure 5.2.5a). Similarly, if $\mathbf{v}$ is a nonzero vector
in $R^2$ or $R^3$, then $\mathrm{span}\{\mathbf{v}\}$, which is the set of all scalar multiples $k\mathbf{v}$, is the line determined by $\mathbf{v}$ (see Figure 5.2.5b).



Figure 5.2.5
(a) $\mathrm{span}\{\mathbf{v}_1, \mathbf{v}_2\}$ is the plane through the origin determined by $\mathbf{v}_1$ and $\mathbf{v}_2$.
(b) $\mathrm{span}\{\mathbf{v}\}$ is the line through the origin determined by $\mathbf{v}$.



EXAMPLE 11 Spanning Set for $P_n$



The polynomials $1, x, x^2, \ldots, x^n$ span the vector space $P_n$ defined in Example 5 since each polynomial p in $P_n$ can be written
as

$p = a_0 + a_1x + \cdots + a_nx^n$

which is a linear combination of $1, x, x^2, \ldots, x^n$. We can denote this by writing

$P_n = \mathrm{span}\{1, x, x^2, \ldots, x^n\}$



EXAMPLE 12 Three Vectors That Do Not Span $R^3$



Determine whether $\mathbf{v}_1 = (1, 1, 2)$, $\mathbf{v}_2 = (1, 0, 1)$, and $\mathbf{v}_3 = (2, 1, 3)$ span the vector space $R^3$.

Solution

We must determine whether an arbitrary vector $\mathbf{b} = (b_1, b_2, b_3)$ in $R^3$ can be expressed as a linear combination

$\mathbf{b} = k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + k_3\mathbf{v}_3$

of the vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$. Expressing this equation in terms of components gives

$(b_1, b_2, b_3) = k_1(1, 1, 2) + k_2(1, 0, 1) + k_3(2, 1, 3)$

or

$(b_1, b_2, b_3) = (k_1 + k_2 + 2k_3,\ k_1 + k_3,\ 2k_1 + k_2 + 3k_3)$

or

$k_1 + k_2 + 2k_3 = b_1$
$k_1 + k_3 = b_2$
$2k_1 + k_2 + 3k_3 = b_3$

The problem thus reduces to determining whether this system is consistent for all values of $b_1$, $b_2$, and $b_3$. By parts (e) and
(g) of Theorem 4.3.4, this system is consistent for all $b_1$, $b_2$, and $b_3$ if and only if the coefficient matrix

$A = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 0 & 1 \\ 2 & 1 & 3 \end{bmatrix}$

has a nonzero determinant. However, $\det(A) = 0$ (verify), so $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ do not span $R^3$.
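
The determinant check in Example 12 can be carried out mechanically. The following is a minimal sketch in Python, not part of the original text; the helper name `det3` is hypothetical, and the cofactor expansion is only one of several ways to evaluate a 3 x 3 determinant.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Coefficient matrix of Example 12; its columns are v1, v2, v3.
A = [[1, 1, 2],
     [1, 0, 1],
     [2, 1, 3]]

# A zero determinant means the system is not consistent for every b,
# so v1, v2, v3 do not span R^3.
print(det3(A))
```

One way to see why the determinant vanishes: $\mathbf{v}_3 = \mathbf{v}_1 + \mathbf{v}_2$, so the third column of A is the sum of the first two.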

Spanning sets are not unique. For example, any two noncollinear vectors that lie in the plane shown in Figure 5.2.5 will span 
that same plane, and any nonzero vector on the line in that figure will span the same line. We leave the proof of the 
following useful theorem as an exercise. 



THEOREM 5.2.4 



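If $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ and $S' = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_k\}$ are two sets of vectors in a vector space V, then

$\mathrm{span}\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\} = \mathrm{span}\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_k\}$

if and only if each vector in S is a linear combination of those in S' and each vector in S' is a
linear combination of those in S.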
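
For vectors in $R^n$, the condition in Theorem 5.2.4 can be tested numerically. The sketch below is an illustration only and uses a technique the text has not introduced at this point: it counts pivot rows after Gaussian elimination (the rank) and declares two sets to span the same space when each set alone and the two sets combined all give the same count. The function names `rank` and `same_span` are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivot rows after Gaussian elimination, using exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Find a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def same_span(S, T):
    """True when the lists of vectors S and T span the same subspace of R^n."""
    return rank(S) == rank(T) == rank(S + T)

S = [(1, 0, 0), (0, 1, 0)]
T = [(1, 1, 0), (1, -1, 0)]   # another pair of vectors in the xy-plane
U = [(1, 0, 0), (0, 0, 1)]    # a pair spanning the xz-plane instead

print(same_span(S, T), same_span(S, U))
```

Exact fractions are used so that the test for a zero pivot is not affected by rounding.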



Exercise Set 5.2 






1. Use Theorem 5.2.1 to determine which of the following are subspaces of $R^3$.

(a) all vectors of the form (a, 0, 0)

(b) all vectors of the form (a, 1, 1)

(c) all vectors of the form (a, b, c), where b = a + c

(d) all vectors of the form (a, b, c), where b = a + c + 1

(e) all vectors of the form (a, b, 0)



2. Use Theorem 5.2.1 to determine which of the following are subspaces of $M_{22}$.

(a) all 2 × 2 matrices with integer entries

(b) all matrices $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ where a + b + c + d = 0

(c) all 2 × 2 matrices A such that det(A) = 0

(d) all matrices of the form $\begin{bmatrix} a & b \\ 0 & c \end{bmatrix}$

(e) all matrices of the form $\begin{bmatrix} a & a \\ -a & -a \end{bmatrix}$

3. Use Theorem 5.2.1 to determine which of the following are subspaces of $P_3$.

(a) all polynomials $a_0 + a_1x + a_2x^2 + a_3x^3$ for which $a_0 = 0$

(b) all polynomials $a_0 + a_1x + a_2x^2 + a_3x^3$ for which $a_0 + a_1 + a_2 + a_3 = 0$

(c) all polynomials $a_0 + a_1x + a_2x^2 + a_3x^3$ for which $a_0$, $a_1$, $a_2$, and $a_3$ are integers

(d) all polynomials of the form $a_0 + a_1x$, where $a_0$ and $a_1$ are real numbers

4. Use Theorem 5.2.1 to determine which of the following are subspaces of the space $F(-\infty, \infty)$.

(a) all f such that $f(x) \le 0$ for all x

(b) all f such that f(0) = 0

(c) all f such that f(0) = 2

(d) all constant functions

(e) all f of the form $k_1 + k_2 \sin x$, where $k_1$ and $k_2$ are real numbers



5. Use Theorem 5.2.1 to determine which of the following are subspaces of $M_{nn}$.

(a) all n × n matrices A such that tr(A) = 0

(b) all n × n matrices A such that $A^T = -A$

(c) all n × n matrices A such that the linear system $A\mathbf{x} = \mathbf{0}$ has only the trivial solution

(d) all n × n matrices A such that AB = BA for a fixed n × n matrix B



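6. Determine whether the solution space of the system $A\mathbf{x} = \mathbf{0}$ is a line through the origin, a plane through the origin, or the
origin only. If it is a plane, find an equation for it; if it is a line, find parametric equations for it.

(a) $A = \begin{bmatrix} -1 & 1 & 1 \\ 3 & -1 & 0 \\ 2 & -4 & -5 \end{bmatrix}$

(b) $A = \begin{bmatrix} 1 & -2 & 3 \\ -3 & 6 & 9 \\ -2 & 4 & -6 \end{bmatrix}$

(c) $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 3 \\ 1 & 0 & 8 \end{bmatrix}$

(d) $A = \begin{bmatrix} 1 & 2 & -6 \\ 1 & 4 & 4 \\ 3 & 10 & 6 \end{bmatrix}$

(e) $A = \begin{bmatrix} 1 & -1 & 1 \\ 2 & -1 & 4 \\ 3 & 1 & 11 \end{bmatrix}$

(f) $A = \begin{bmatrix} 1 & -3 & 1 \\ 2 & -6 & 2 \\ 3 & -9 & 3 \end{bmatrix}$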

7. Which of the following are linear combinations of $\mathbf{u} = (0, -2, 2)$ and $\mathbf{v} = (1, 3, -1)$?

(a) (2, 2, 2)

(b) (3, 1, 5)

(c) (0, 4, 5)

(d) (0, 0, 0)

8. Express the following as linear combinations of $\mathbf{u} = (2, 1, 4)$, $\mathbf{v} = (1, -1, 3)$, and $\mathbf{w} = (3, 2, 5)$.

(a) (-9, -7, -15)

(b) (6, 11, 6)

(c) (0, 0, 0)

(d) (7, 8, 9)

9. Express the following as linear combinations of $\mathbf{p}_1 = 2 + x + 4x^2$, $\mathbf{p}_2 = 1 - x + 3x^2$, and $\mathbf{p}_3 = 3 + 2x + 5x^2$.

(a) $-9 - 7x - 15x^2$

(b) $6 + 11x + 6x^2$

(c) 0

(d) $7 + 8x + 9x^2$



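10. Which of the following are linear combinations of

$A = \begin{bmatrix} 4 & 0 \\ -2 & -2 \end{bmatrix}$, $B = \begin{bmatrix} 1 & -1 \\ 2 & 3 \end{bmatrix}$, $C = \begin{bmatrix} 0 & 2 \\ 1 & 4 \end{bmatrix}$?

(a) $\begin{bmatrix} 6 & -8 \\ -1 & -8 \end{bmatrix}$

(b) $\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$

(c) $\begin{bmatrix} 6 & 0 \\ 3 & 8 \end{bmatrix}$

(d) $\begin{bmatrix} -1 & 5 \\ 7 & 1 \end{bmatrix}$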



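11. In each part, determine whether the given vectors span $R^3$.

(a) $\mathbf{v}_1 = (2, 2, 2)$, $\mathbf{v}_2 = (0, 0, 3)$, $\mathbf{v}_3 = (0, 1, 1)$

(b) $\mathbf{v}_1 = (2, -1, 3)$, $\mathbf{v}_2 = (4, 1, 2)$, $\mathbf{v}_3 = (8, -1, 8)$

(c) $\mathbf{v}_1 = (3, 1, 4)$, $\mathbf{v}_2 = (2, -3, 5)$, $\mathbf{v}_3 = (5, -2, 9)$, $\mathbf{v}_4 = (1, 4, -1)$

(d) $\mathbf{v}_1 = (1, 2, 6)$, $\mathbf{v}_2 = (3, 4, 1)$, $\mathbf{v}_3 = (4, 3, 1)$, $\mathbf{v}_4 = (3, 3, 1)$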



12. Let $\mathbf{f} = \cos^2 x$ and $\mathbf{g} = \sin^2 x$. Which of the following lie in the space spanned by $\mathbf{f}$ and $\mathbf{g}$?

(a) $\cos 2x$

(b) $3 + x^2$

(c) 1

(d) $\sin x$

(e) 0







13. Determine whether the following polynomials span $P_2$.

$\mathbf{p}_1 = 1 - x + 2x^2$, $\mathbf{p}_2 = 3 + x$, $\mathbf{p}_3 = 5 - x + 4x^2$, $\mathbf{p}_4 = -2 - 2x + 2x^2$



14. Let $\mathbf{v}_1 = (2, 1, 0, 3)$, $\mathbf{v}_2 = (3, -1, 5, 2)$, and $\mathbf{v}_3 = (-1, 0, 2, 1)$. Which of the following vectors are in
$\mathrm{span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$?

(a) (2, 3, -7, 3)

(b) (0, 0, 0, 0)

(c) (1, 1, 1, 1)

(d) (-4, 6, -13, 4)



15. Find an equation for the plane spanned by the vectors $\mathbf{u} = (-1, 1, 1)$ and $\mathbf{v} = (3, 4, 4)$.

16. Find parametric equations for the line spanned by the vector $\mathbf{u} = (3, -2, 5)$.

17. Show that the solution vectors of a consistent nonhomogeneous system of m linear equations in n unknowns do not form
a subspace of $R^n$.

18. Prove Theorem 5.2.4.

19. Use Theorem 5.2.4 to show that $\mathbf{v}_1 = (1, 6, 4)$, $\mathbf{v}_2 = (2, 4, -1)$, $\mathbf{v}_3 = (-1, 2, 5)$, and $\mathbf{w}_1 = (1, -2, -5)$,
$\mathbf{w}_2 = (0, 8, 9)$ span the same subspace of $R^3$.

20. A line L through the origin in $R^3$ can be represented by parametric equations of the form x = at, y = bt, and z = ct. Use
these equations to show that L is a subspace of $R^3$; that is, show that if $\mathbf{v}_1 = (x_1, y_1, z_1)$ and $\mathbf{v}_2 = (x_2, y_2, z_2)$ are points
on L and k is any real number, then $k\mathbf{v}_1$ and $\mathbf{v}_1 + \mathbf{v}_2$ are also points on L.

21. (For Readers Who Have Studied Calculus) Show that the following sets of functions are subspaces of $F(-\infty, \infty)$.

(a) all everywhere continuous functions

(b) all everywhere differentiable functions

(c) all everywhere differentiable functions that satisfy $f' + 2f = 0$

22. (For Readers Who Have Studied Calculus) Show that the set of continuous functions f = f(x) on [a, b] such that

$\int_a^b f(x)\,dx = 0$

is a subspace of C[a, b].



Discussion

23. Indicate whether each statement is always true or sometimes false. Justify your answer by
giving a logical argument or a counterexample.

(a) If $A\mathbf{x} = \mathbf{b}$ is any consistent linear system of m equations in n unknowns, then the
solution set is a subspace of $R^n$.

(b) If W is a set of one or more vectors from a vector space V, and if $k\mathbf{u} + \mathbf{v}$ is a vector in
W for all vectors $\mathbf{u}$ and $\mathbf{v}$ in W and for all scalars k, then W is a subspace of V.

(c) If S is a finite set of vectors in a vector space V, then span(S) must be closed under
addition and scalar multiplication.

(d) The intersection of two subspaces of a vector space V is also a subspace of V.

(e) If $\mathrm{span}(S_1) = \mathrm{span}(S_2)$, then $S_1 = S_2$.



24.

(a) Under what conditions will two vectors in $R^3$ span a plane? A line?

(b) Under what conditions will it be true that $\mathrm{span}\{\mathbf{u}\} = \mathrm{span}\{\mathbf{v}\}$? Explain.

(c) If $A\mathbf{x} = \mathbf{b}$ is a consistent system of m equations in n unknowns, under what conditions
will it be true that the solution set is a subspace of $R^n$? Explain.



25. Recall that lines through the origin are subspaces of $R^2$. If $W_1$ is the line y = x and $W_2$ is the line
y = -x, is the union $W_1 \cup W_2$ a subspace of $R^2$? Explain your reasoning.



26.

(a) Let $M_{22}$ be the vector space of 2 × 2 matrices. Find four matrices that span $M_{22}$.

(b) In words, describe a set of matrices that spans $M_{nn}$.



27. We showed in Example 8 that the vectors i, j, k span $R^3$. However, spanning sets are not
unique. What geometric property must a set of three vectors in $R^3$ have if they are to span $R^3$?

5.3

LINEAR INDEPENDENCE 



In the preceding section we learned that a set of vectors $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$
spans a given vector space V if every vector in V is expressible as a linear 
combination of the vectors in S. In general, there may be more than one way to 
express a vector in V as a linear combination of vectors in a spanning set. In 
this section we shall study conditions under which each vector in V is 
expressible as a linear combination of the spanning vectors in exactly one way. 
Spanning sets with this property play a fundamental role in the study of vector 
spaces. 



DEFINITION 



If $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a nonempty set of vectors, then the vector equation

$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_r\mathbf{v}_r = \mathbf{0}$

has at least one solution, namely

$k_1 = 0,\ k_2 = 0,\ \ldots,\ k_r = 0$

If this is the only solution, then S is called a linearly independent set. If there are other solutions, then S is called a linearly
dependent set.



EXAMPLE 1 A Linearly Dependent Set 



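If $\mathbf{v}_1 = (2, -1, 0, 3)$, $\mathbf{v}_2 = (1, 2, 5, -1)$, and $\mathbf{v}_3 = (7, -1, 5, 8)$, then the set of vectors $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is linearly dependent,
since $3\mathbf{v}_1 + \mathbf{v}_2 - \mathbf{v}_3 = \mathbf{0}$.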
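
A dependency relation such as the one in Example 1 is easy to confirm componentwise. The following short Python sketch is an illustration only; the helper name `lin_comb` is hypothetical.

```python
def lin_comb(coeffs, vectors):
    """Return the linear combination sum(c * v), computed component by component."""
    n = len(vectors[0])
    return tuple(sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(n))

v1 = (2, -1, 0, 3)
v2 = (1, 2, 5, -1)
v3 = (7, -1, 5, 8)

# Coefficients 3, 1, -1 are not all zero, yet the combination is the zero vector.
print(lin_comb((3, 1, -1), (v1, v2, v3)))
```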



EXAMPLE 2 A Linearly Dependent Set 



The polynomials

$\mathbf{p}_1 = 1 - x$, $\mathbf{p}_2 = 5 + 3x - 2x^2$, and $\mathbf{p}_3 = 1 + 3x - x^2$

form a linearly dependent set in $P_2$ since $3\mathbf{p}_1 - \mathbf{p}_2 + 2\mathbf{p}_3 = \mathbf{0}$.



EXAMPLE 3 Linearly Independent Sets 



Consider the vectors $\mathbf{i} = (1, 0, 0)$, $\mathbf{j} = (0, 1, 0)$, and $\mathbf{k} = (0, 0, 1)$ in $R^3$. In terms of components, the vector equation

$k_1\mathbf{i} + k_2\mathbf{j} + k_3\mathbf{k} = \mathbf{0}$

becomes

$k_1(1, 0, 0) + k_2(0, 1, 0) + k_3(0, 0, 1) = (0, 0, 0)$

or, equivalently,

$(k_1, k_2, k_3) = (0, 0, 0)$

This implies that $k_1 = 0$, $k_2 = 0$, and $k_3 = 0$, so the set $S = \{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$ is linearly independent. A similar argument can be used to
show that the vectors

$\mathbf{e}_1 = (1, 0, 0, \ldots, 0)$, $\mathbf{e}_2 = (0, 1, 0, \ldots, 0)$, ..., $\mathbf{e}_n = (0, 0, 0, \ldots, 1)$

form a linearly independent set in $R^n$.



EXAMPLE 4 Determining Linear Independence/Dependence 



Determine whether the vectors

$\mathbf{v}_1 = (1, -2, 3)$, $\mathbf{v}_2 = (5, 6, -1)$, $\mathbf{v}_3 = (3, 2, 1)$

form a linearly dependent set or a linearly independent set.

Solution

In terms of components, the vector equation

$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + k_3\mathbf{v}_3 = \mathbf{0}$

becomes

$k_1(1, -2, 3) + k_2(5, 6, -1) + k_3(3, 2, 1) = (0, 0, 0)$

or, equivalently,

$(k_1 + 5k_2 + 3k_3,\ -2k_1 + 6k_2 + 2k_3,\ 3k_1 - k_2 + k_3) = (0, 0, 0)$

Equating corresponding components gives

$k_1 + 5k_2 + 3k_3 = 0$
$-2k_1 + 6k_2 + 2k_3 = 0$
$3k_1 - k_2 + k_3 = 0$

Thus $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ form a linearly dependent set if this system has a nontrivial solution, or a linearly independent set if it has
only the trivial solution. Solving this system using Gaussian elimination yields

$k_1 = -\tfrac{1}{2}t,\quad k_2 = -\tfrac{1}{2}t,\quad k_3 = t$

Thus the system has nontrivial solutions and $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ form a linearly dependent set. Alternatively, we could show the
existence of nontrivial solutions without solving the system by showing that the coefficient matrix has determinant zero and
consequently is not invertible (verify).
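
The elimination step in Example 4 can be reproduced with exact arithmetic. The sketch below is one possible implementation, not the text's own procedure: it row-reduces the coefficient matrix of the homogeneous system whose columns are (1, -2, 3), (5, 6, -1), and (3, 2, 1), then builds a nonzero solution by setting the first free variable to 1. The function names `rref` and `nontrivial_solution` are hypothetical.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form over exact fractions; returns (matrix, pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for col in range(len(m[0])):
        if r == len(m):
            break
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        lead = m[r][col]
        m[r] = [x / lead for x in m[r]]          # scale the pivot row to a leading 1
        for i in range(len(m)):
            if i != r and m[i][col] != 0:        # clear the rest of the column
                factor = m[i][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
    return m, pivots

def nontrivial_solution(rows):
    """One nonzero solution of the homogeneous system, or None if only the trivial one exists."""
    m, pivots = rref(rows)
    ncols = len(rows[0])
    free = [c for c in range(ncols) if c not in pivots]
    if not free:
        return None
    sol = [Fraction(0)] * ncols
    sol[free[0]] = Fraction(1)                   # set the first free variable to 1
    for row_index, col in enumerate(pivots):
        sol[col] = -m[row_index][free[0]]
    return sol

# Coefficient matrix of the system in Example 4 (columns are v1, v2, v3).
A = [[1, 5, 3],
     [-2, 6, 2],
     [3, -1, 1]]

print(nontrivial_solution(A))
```

With the free variable $k_3$ set to 1, the sketch returns $k_1 = -\tfrac{1}{2}$, $k_2 = -\tfrac{1}{2}$, $k_3 = 1$; every other solution is a scalar multiple of this one.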



EXAMPLE 5 Linearly Independent Set in $P_n$



Show that the polynomials

$1, x, x^2, \ldots, x^n$

form a linearly independent set of vectors in $P_n$.

Solution

Let $\mathbf{p}_0 = 1$, $\mathbf{p}_1 = x$, $\mathbf{p}_2 = x^2$, ..., $\mathbf{p}_n = x^n$ and assume that some linear combination of these polynomials is zero, say

$a_0\mathbf{p}_0 + a_1\mathbf{p}_1 + a_2\mathbf{p}_2 + \cdots + a_n\mathbf{p}_n = \mathbf{0}$

or, equivalently,

$a_0 + a_1x + a_2x^2 + \cdots + a_nx^n = 0$ for all x in $(-\infty, \infty)$    (1)

We must show that

$a_0 = a_1 = a_2 = \cdots = a_n = 0$

To see that this is so, recall from algebra that a nonzero polynomial of degree n has at most n distinct roots. But this implies that
$a_0 = a_1 = a_2 = \cdots = a_n = 0$; otherwise, it would follow from (1) that $a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$ is a nonzero polynomial with
infinitely many roots.

The term linearly dependent suggests that the vectors "depend" on each other in some way. The following theorem shows that 
this is in fact the case. 

THEOREM 5.3.1 



A set S with two or more vectors is 

(a) Linearly dependent if and only if at least one of the vectors in S is expressible as a linear combination of the other 
vectors in S. 



(b) Linearly independent if and only if no vector in S is expressible as a linear combination of the other vectors in S. 



We shall prove part (a) and leave the proof of part (b) as an exercise. 



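Proof (a) Let $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ be a set with two or more vectors. If we assume that S is linearly dependent, then there are
scalars $k_1, k_2, \ldots, k_r$, not all zero, such that

$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_r\mathbf{v}_r = \mathbf{0}$    (2)

To be specific, suppose that $k_1 \ne 0$. Then (2) can be rewritten as

$\mathbf{v}_1 = \left(-\frac{k_2}{k_1}\right)\mathbf{v}_2 + \cdots + \left(-\frac{k_r}{k_1}\right)\mathbf{v}_r$

which expresses $\mathbf{v}_1$ as a linear combination of the other vectors in S. Similarly, if $k_j \ne 0$ in (2) for some
j = 2, 3, ..., r, then $\mathbf{v}_j$ is expressible as a linear combination of the other vectors in S.

Conversely, let us assume that at least one of the vectors in S is expressible as a linear combination of the other vectors. To be
specific, suppose that

$\mathbf{v}_1 = c_2\mathbf{v}_2 + c_3\mathbf{v}_3 + \cdots + c_r\mathbf{v}_r$

so

$\mathbf{v}_1 - c_2\mathbf{v}_2 - c_3\mathbf{v}_3 - \cdots - c_r\mathbf{v}_r = \mathbf{0}$

It follows that S is linearly dependent since the equation

$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_r\mathbf{v}_r = \mathbf{0}$

is satisfied by

$k_1 = 1,\ k_2 = -c_2,\ \ldots,\ k_r = -c_r$

which are not all zero. The proof in the case where some vector other than $\mathbf{v}_1$ is expressible as a linear combination of the other
vectors in S is similar.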



EXAMPLE 6 Example 1 Revisited 



In Example 1 we saw that the vectors

$\mathbf{v}_1 = (2, -1, 0, 3)$, $\mathbf{v}_2 = (1, 2, 5, -1)$, and $\mathbf{v}_3 = (7, -1, 5, 8)$

form a linearly dependent set. It follows from Theorem 5.3.1 that at least one of these vectors is expressible as a linear
combination of the other two. In this example each vector is expressible as a linear combination of the other two since it follows
from the equation $3\mathbf{v}_1 + \mathbf{v}_2 - \mathbf{v}_3 = \mathbf{0}$ (see Example 1) that

$\mathbf{v}_1 = -\tfrac{1}{3}\mathbf{v}_2 + \tfrac{1}{3}\mathbf{v}_3$, $\mathbf{v}_2 = -3\mathbf{v}_1 + \mathbf{v}_3$, and $\mathbf{v}_3 = 3\mathbf{v}_1 + \mathbf{v}_2$



EXAMPLE 7 Example 3 Revisited 



In Example 3 we saw that the vectors $\mathbf{i} = (1, 0, 0)$, $\mathbf{j} = (0, 1, 0)$, and $\mathbf{k} = (0, 0, 1)$ form a linearly independent set. Thus it follows
from Theorem 5.3.1 that none of these vectors is expressible as a linear combination of the other two. To see directly that this is
so, suppose that $\mathbf{k}$ is expressible as

$\mathbf{k} = k_1\mathbf{i} + k_2\mathbf{j}$

Then, in terms of components,

$(0, 0, 1) = k_1(1, 0, 0) + k_2(0, 1, 0)$ or $(0, 0, 1) = (k_1, k_2, 0)$

But the last equation is not satisfied by any values of $k_1$ and $k_2$, so $\mathbf{k}$ cannot be expressed as a linear combination of $\mathbf{i}$ and $\mathbf{j}$.
Similarly, $\mathbf{i}$ is not expressible as a linear combination of $\mathbf{j}$ and $\mathbf{k}$, and $\mathbf{j}$ is not expressible as a linear combination of $\mathbf{i}$ and $\mathbf{k}$.

The following theorem gives two simple facts about linear independence that are important to know. 
THEOREM 5.3.2 



(a) A finite set of vectors that contains the zero vector is linearly dependent. 

(b) A set with exactly two vectors is linearly independent if and only if neither vector is a scalar multiple of the other. 



We shall prove part (a) and leave the proof of part (b) as an exercise. 

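Proof (a) For any vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$, the set $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r, \mathbf{0}\}$ is linearly dependent since the equation

$0\mathbf{v}_1 + 0\mathbf{v}_2 + \cdots + 0\mathbf{v}_r + 1(\mathbf{0}) = \mathbf{0}$

expresses $\mathbf{0}$ as a linear combination of the vectors in S with coefficients that are not all zero.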



EXAMPLE 8 Using Theorem 5.3.2b



The functions $\mathbf{f}_1 = x$ and $\mathbf{f}_2 = \sin x$ form a linearly independent set of vectors in $F(-\infty, \infty)$, since neither function is a
constant multiple of the other.



Geometric Interpretation of Linear Independence 

Linear independence has some useful geometric interpretations in $R^2$ and $R^3$:

* In $R^2$ or $R^3$, a set of two vectors is linearly independent if and only if the vectors do not lie on the same line when they are
placed with their initial points at the origin (Figure 5.3.1).

Figure 5.3.1 (a) Linearly dependent (b) Linearly dependent (c) Linearly independent

* In $R^3$, a set of three vectors is linearly independent if and only if the vectors do not lie in the same plane when they are
placed with their initial points at the origin (Figure 5.3.2).

Figure 5.3.2 (a) Linearly dependent (b) Linearly dependent (c) Linearly independent



The first result follows from the fact that two vectors are linearly independent if and only if neither vector is a scalar multiple of 
the other. Geometrically, this is equivalent to stating that the vectors do not lie on the same line when they are positioned with 
their initial points at the origin. 

The second result follows from the fact that three vectors are linearly independent if and only if none of the vectors is a linear
combination of the other two. Geometrically, this is equivalent to stating that none of the vectors lies in the same plane as the
other two, or, alternatively, that the three vectors do not lie in a common plane when they are positioned with their initial points
at the origin (why?).
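
The coplanarity criterion for three vectors in $R^3$ can be checked with the scalar triple product, which equals the determinant of the matrix whose rows are the three vectors. This is a swapped-in technique rather than the argument used in the text, and the function names `triple_product` and `coplanar` are hypothetical. The sketch assumes integer (or otherwise exact) components so that comparison with zero is meaningful.

```python
def triple_product(u, v, w):
    """Scalar triple product u . (v x w) for vectors in R^3."""
    cross = (v[1] * w[2] - v[2] * w[1],
             v[2] * w[0] - v[0] * w[2],
             v[0] * w[1] - v[1] * w[0])
    return sum(a * b for a, b in zip(u, cross))

def coplanar(u, v, w):
    """True when u, v, w (initial points at the origin) lie in a common plane."""
    return triple_product(u, v, w) == 0

# The standard unit vectors do not lie in a common plane, so they are independent.
print(coplanar((1, 0, 0), (0, 1, 0), (0, 0, 1)))

# (2, 1, 3) = (1, 1, 2) + (1, 0, 1), so these three vectors share a plane.
print(coplanar((1, 1, 2), (1, 0, 1), (2, 1, 3)))
```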

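The next theorem shows that a linearly independent set in $R^n$ can contain at most n vectors.
THEOREM 5.3.3

Let $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ be a set of vectors in $R^n$. If r > n, then S is linearly dependent.

Proof Suppose that

$\mathbf{v}_1 = (v_{11}, v_{12}, \ldots, v_{1n})$
$\mathbf{v}_2 = (v_{21}, v_{22}, \ldots, v_{2n})$
$\vdots$
$\mathbf{v}_r = (v_{r1}, v_{r2}, \ldots, v_{rn})$

Consider the equation

$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_r\mathbf{v}_r = \mathbf{0}$

If, as illustrated in Example 4, we express both sides of this equation in terms of components and
then equate corresponding components, we obtain the system

$v_{11}k_1 + v_{21}k_2 + \cdots + v_{r1}k_r = 0$
$v_{12}k_1 + v_{22}k_2 + \cdots + v_{r2}k_r = 0$
$\vdots$
$v_{1n}k_1 + v_{2n}k_2 + \cdots + v_{rn}k_r = 0$

This is a homogeneous system of n equations in the r unknowns $k_1, \ldots, k_r$. Since r > n, it follows from
Theorem 1.2.1 that the system has nontrivial solutions. Therefore, $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a linearly
dependent set.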



Remark The preceding theorem tells us that a set in $R^2$ with more than two vectors is linearly dependent and a set in $R^3$ with
more than three vectors is linearly dependent.

Linear Independence of Functions 

Calculus Required

Sometimes linear dependence of functions can be deduced from known identities. For example, the functions

$\mathbf{f}_1 = \sin^2 x$, $\mathbf{f}_2 = \cos^2 x$, and $\mathbf{f}_3 = 5$

form a linearly dependent set in $F(-\infty, \infty)$, since the equation

$5\mathbf{f}_1 + 5\mathbf{f}_2 - \mathbf{f}_3 = 5\sin^2 x + 5\cos^2 x - 5 = 5(\sin^2 x + \cos^2 x) - 5 = \mathbf{0}$

expresses $\mathbf{0}$ as a linear combination of $\mathbf{f}_1$, $\mathbf{f}_2$, and $\mathbf{f}_3$ with coefficients that are not all zero. However, it is only in special
situations that such identities can be applied. Although there is no general method that can be used to establish linear
independence or linear dependence of functions in $F(-\infty, \infty)$, we shall now develop a theorem that can sometimes be used to
show that a given set of functions is linearly independent.

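If $\mathbf{f}_1 = f_1(x)$, $\mathbf{f}_2 = f_2(x)$, ..., $\mathbf{f}_n = f_n(x)$ are n - 1 times differentiable functions on the interval $(-\infty, \infty)$, then the
determinant

$W(x) = \begin{vmatrix} f_1(x) & f_2(x) & \cdots & f_n(x) \\ f_1'(x) & f_2'(x) & \cdots & f_n'(x) \\ \vdots & \vdots & & \vdots \\ f_1^{(n-1)}(x) & f_2^{(n-1)}(x) & \cdots & f_n^{(n-1)}(x) \end{vmatrix}$

is called the Wronskian of $\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n$. As we shall now show, this determinant is useful for ascertaining whether the
functions $\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n$ form a linearly independent set of vectors in the vector space $C^{(n-1)}(-\infty, \infty)$.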

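Suppose, for the moment, that $\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n$ are linearly dependent vectors in $C^{(n-1)}(-\infty, \infty)$. Then there exist scalars
$k_1, k_2, \ldots, k_n$, not all zero, such that

$k_1f_1(x) + k_2f_2(x) + \cdots + k_nf_n(x) = 0$

for all x in the interval $(-\infty, \infty)$. Combining this equation with the equations obtained by n - 1 successive differentiations
yields

$k_1f_1(x) + k_2f_2(x) + \cdots + k_nf_n(x) = 0$
$k_1f_1'(x) + k_2f_2'(x) + \cdots + k_nf_n'(x) = 0$
$\vdots$
$k_1f_1^{(n-1)}(x) + k_2f_2^{(n-1)}(x) + \cdots + k_nf_n^{(n-1)}(x) = 0$

Thus, the linear dependence of $\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n$ implies that the linear system

$\begin{bmatrix} f_1(x) & f_2(x) & \cdots & f_n(x) \\ f_1'(x) & f_2'(x) & \cdots & f_n'(x) \\ \vdots & \vdots & & \vdots \\ f_1^{(n-1)}(x) & f_2^{(n-1)}(x) & \cdots & f_n^{(n-1)}(x) \end{bmatrix} \begin{bmatrix} k_1 \\ k_2 \\ \vdots \\ k_n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$

has a nontrivial solution for every x in the interval $(-\infty, \infty)$. This implies in turn that for every x in $(-\infty, \infty)$ the
coefficient matrix is not invertible, or, equivalently, that its determinant (the Wronskian) is zero for every x in $(-\infty, \infty)$.
Thus, if the Wronskian is not identically zero on $(-\infty, \infty)$, then the functions $\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n$ must be linearly independent
vectors in $C^{(n-1)}(-\infty, \infty)$. This is the content of the following theorem.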



Jozef Maria Hoene-Wronski 



Jozef Maria Hoene-Wronski (1776-1853) was a Polish-French mathematician and philosopher. Wronski received his early 
education in Poznan and Warsaw. He served as an artillery officer in the Prussian army in a national uprising in 1794, was
taken prisoner by the Russian army, and on his release studied philosophy at various German universities. He became a French
citizen in 1800 and eventually settled in Paris, where he did research in analysis leading to some controversial mathematical 
papers and relatedly to a famous court trial over financial matters. Several years thereafter, his proposed research on the 
determination of longitude at sea was rebuffed by the British Board of Longitude, and Wronski turned to studies in Messianic 
philosophy. In the 1830s he investigated the feasibility of caterpillar vehicles to compete with trains, with no luck, and spent 
his last years in poverty. Much of his mathematical work was fraught with errors and imprecision, but it often contained 
valuable isolated results and ideas. Some writers attribute this lifelong pattern of argumentation to psychopathic tendencies 
and to an exaggeration of the importance of his own work. 



THEOREM 5.3.4 



If the functions $\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n$ have n - 1 continuous derivatives on the interval $(-\infty, \infty)$, and if the Wronskian of these
functions is not identically zero on $(-\infty, \infty)$, then these functions form a linearly independent set of vectors in
$C^{(n-1)}(-\infty, \infty)$.



EXAMPLE 9 Linearly Independent Set in $C^1(-\infty, \infty)$



Show that the functions $\mathbf{f}_1 = x$ and $\mathbf{f}_2 = \sin x$ form a linearly independent set of vectors in $C^1(-\infty, \infty)$.

Solution

In Example 8 we showed that these vectors form a linearly independent set by noting that neither vector is a scalar multiple of the
other. However, for illustrative purposes, we shall obtain this same result using Theorem 5.3.4. The Wronskian is

$W(x) = \begin{vmatrix} x & \sin x \\ 1 & \cos x \end{vmatrix} = x\cos x - \sin x$

This function does not have value zero for all x in the interval $(-\infty, \infty)$, as can be seen by evaluating it at $x = \pi/2$, so $\mathbf{f}_1$ and
$\mathbf{f}_2$ form a linearly independent set.



EXAMPLE 10 Linearly Independent Set in $C^2(-\infty, \infty)$



Show that $\mathbf{f}_1 = 1$, $\mathbf{f}_2 = e^x$, and $\mathbf{f}_3 = e^{2x}$ form a linearly independent set of vectors in $C^2(-\infty, \infty)$.

Solution

The Wronskian is

$W(x) = \begin{vmatrix} 1 & e^x & e^{2x} \\ 0 & e^x & 2e^{2x} \\ 0 & e^x & 4e^{2x} \end{vmatrix} = 2e^{3x}$

This function does not have value zero for all x (in fact, for any x) in the interval $(-\infty, \infty)$, so $\mathbf{f}_1$, $\mathbf{f}_2$, and $\mathbf{f}_3$ form a linearly
independent set.
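
The Wronskian of 1, $e^x$, and $e^{2x}$ can be spot-checked numerically. In the sketch below the derivatives are entered by hand rather than computed symbolically, and the result is compared with $2e^{3x}$ at a few sample points; the function names `det3` and `wronskian_example` are hypothetical. A numerical check at sample points is an illustration, not a proof.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def wronskian_example(x):
    """Wronskian of 1, e^x, e^(2x) at the point x."""
    e1, e2 = math.exp(x), math.exp(2 * x)
    rows = [[1.0, e1, e2],         # the functions themselves
            [0.0, e1, 2 * e2],     # first derivatives
            [0.0, e1, 4 * e2]]     # second derivatives
    return det3(rows)

for x in (-1.0, 0.0, 1.5):
    # Each value should agree with 2 * e^(3x) up to rounding.
    print(x, wronskian_example(x), 2 * math.exp(3 * x))
```

Since $2e^{3x}$ is never zero, a single sample point with a nonzero value is already enough to apply Theorem 5.3.4.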



Remark The converse of Theorem 5.3.4 is false. If the Wronskian of $\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n$ is identically zero on $(-\infty, \infty)$, then no
conclusion can be reached about the linear independence of $\{\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n\}$; this set of vectors may be linearly independent or
linearly dependent.



Exercise Set 5.3 






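1. Explain why the following are linearly dependent sets of vectors. (Solve this problem by inspection.)

(a) $\mathbf{u}_1 = (-1, 2, 4)$ and $\mathbf{u}_2 = (5, -10, -20)$ in $R^3$

(b) $\mathbf{u}_1 = (3, -1)$, $\mathbf{u}_2 = (4, 5)$, $\mathbf{u}_3 = (-4, 7)$ in $R^2$

(c) $\mathbf{p}_1 = 3 - 2x + x^2$ and $\mathbf{p}_2 = 6 - 4x + 2x^2$ in $P_2$

(d) $A = \begin{bmatrix} -3 & 4 \\ 2 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & -4 \\ -2 & 0 \end{bmatrix}$ in $M_{22}$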



2. Which of the following sets of vectors in $R^3$ are linearly dependent?

(a) (4, -1, 2), (-4, 10, 2)

(b) (-3, 0, 4), (5, -1, 2), (1, 1, 3)

(c) (8, -1, 3), (4, 0, 1)

(d) (-2, 0, 1), (3, 2, 5), (6, -1, 1), (7, 0, -2)



3. Which of the following sets of vectors in $R^4$ are linearly dependent?

(a) (3, 8, 7, -3), (1, 5, 3, -1), (2, -1, 2, 6), (1, 4, 0, 3)

(b) (0, 0, 2, 2), (3, 3, 0, 0), (1, 1, 0, -1)

(c) (0, 3, -3, -6), (-2, 0, 0, -6), (0, -4, -2, -2), (0, -8, 4, -4)

(d) (3, 0, -3, 6), (0, 2, 3, 1), (0, -2, -2, 0), (-2, 1, 2, 1)

4. Which of the following sets of vectors in $P_2$ are linearly dependent?

(a) $2 - x + 4x^2$, $3 + 6x + 2x^2$, $2 + 10x - 4x^2$

(b) $3 + x + x^2$, $2 - x + 5x^2$, $4 - 3x^2$

(c) $6 - x^2$, $1 + x + 4x^2$

(d) $1 + 3x + 3x^2$, $x + 4x^2$, $5 + 6x + 3x^2$, $7 + 2x - x^2$

5. Assume that $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ are vectors in $R^3$ that have their initial points at the origin. In each part, determine whether the
three vectors lie in a plane.

(a) $\mathbf{v}_1 = (2, -2, 0)$, $\mathbf{v}_2 = (6, 1, 4)$, $\mathbf{v}_3 = (2, 0, -4)$

(b) $\mathbf{v}_1 = (-6, 7, 2)$, $\mathbf{v}_2 = (3, 2, 4)$, $\mathbf{v}_3 = (4, -1, 2)$

6. Assume that $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ are vectors in $R^3$ that have their initial points at the origin. In each part, determine whether the
three vectors lie on the same line.

(a) $\mathbf{v}_1 = (-1, 2, 3)$, $\mathbf{v}_2 = (2, -4, -6)$, $\mathbf{v}_3 = (-3, 6, 0)$

(b) $\mathbf{v}_1 = (2, -1, 4)$, $\mathbf{v}_2 = (4, 2, 3)$, $\mathbf{v}_3 = (2, 1, -6)$

(c) $\mathbf{v}_1 = (4, 6, 8)$, $\mathbf{v}_2 = (2, 3, 4)$, $\mathbf{v}_3 = (-2, -3, -4)$



7.

(a) Show that the vectors $\mathbf{v}_1 = (0, 3, 1, -1)$, $\mathbf{v}_2 = (6, 0, 5, 1)$, and $\mathbf{v}_3 = (4, -7, 1, 3)$ form a linearly dependent set in $R^4$.

(b) Express each vector as a linear combination of the other two.

8.

(a) Show that the vectors $\mathbf{v}_1 = (1, 2, 3, 4)$, $\mathbf{v}_2 = (0, 1, 0, -1)$, and $\mathbf{v}_3 = (1, 3, 3, 3)$ form a linearly dependent set in $R^4$.

(b) Express each vector as a linear combination of the other two.



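9. For which real values of $\lambda$ do the following vectors form a linearly dependent set in $R^3$?

$\mathbf{v}_1 = (\lambda, -\tfrac{1}{2}, -\tfrac{1}{2})$, $\mathbf{v}_2 = (-\tfrac{1}{2}, \lambda, -\tfrac{1}{2})$, $\mathbf{v}_3 = (-\tfrac{1}{2}, -\tfrac{1}{2}, \lambda)$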



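10. Show that if $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is a linearly independent set of vectors, then so are $\{\mathbf{v}_1, \mathbf{v}_2\}$, $\{\mathbf{v}_1, \mathbf{v}_3\}$, $\{\mathbf{v}_2, \mathbf{v}_3\}$, $\{\mathbf{v}_1\}$, $\{\mathbf{v}_2\}$,
and $\{\mathbf{v}_3\}$.

11. Show that if $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a linearly independent set of vectors, then so is every nonempty subset of S.

12. Show that if $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is a linearly dependent set of vectors in a vector space V, and $\mathbf{v}_4$ is any vector in V, then
$\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ is also linearly dependent.

13. Show that if $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a linearly dependent set of vectors in a vector space V, and if $\mathbf{v}_{r+1}, \ldots, \mathbf{v}_n$ are any vectors in
V, then $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r, \mathbf{v}_{r+1}, \ldots, \mathbf{v}_n\}$ is also linearly dependent.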



14. Show that every set with more than three vectors from $P_2$ is linearly dependent.

15. Show that if $\{\mathbf{v}_1, \mathbf{v}_2\}$ is linearly independent and $\mathbf{v}_3$ does not lie in $\mathrm{span}\{\mathbf{v}_1, \mathbf{v}_2\}$, then $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is linearly independent.

16. Prove: For any vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$, the vectors $\mathbf{u} - \mathbf{v}$, $\mathbf{v} - \mathbf{w}$, and $\mathbf{w} - \mathbf{u}$ form a linearly dependent set.

17. Prove: The space spanned by two vectors in $R^3$ is a line through the origin, a plane through the origin, or the origin itself.

18. Under what conditions is a set with one vector linearly independent?



19. Are the vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ in part (a) of the accompanying figure linearly independent? What about those in part (b)?
Explain.

Figure Ex-19 (a) (b)

20. Use appropriate identities, where required, to determine which of the following sets of vectors in $F(-\infty, \infty)$ are linearly
dependent.

(a) $6$, $3\sin^2 x$, $2\cos^2 x$

(b) $x$, $\cos x$

(c) $1$, $\sin x$, $\sin 2x$

(d) $\cos 2x$, $\sin^2 x$, $\cos^2 x$

(e) $(3 - x)^2$, $x^2 - 6x$, $5$

(f) $0$, $\cos \pi x$, $\sin 3\pi x$



21. (For Readers Who Have Studied Calculus) Use the Wronskian to show that the following sets of vectors are linearly
independent.

(a) $1$, $x$, $e^x$

(b) $\sin x$, $\cos x$, $x\sin x$

(c) $e^x$, $xe^x$, $x^2e^x$

(d) $1$, $x$, $x^2$

22. Use part (a) of Theorem 5.3.1 to prove part (b).

23. Prove part (b) of Theorem 5.3.2.



Discussion
Discovery

24. Indicate whether each statement is always true or sometimes false. Justify your answer by giving a
logical argument or a counterexample.

(a) The set of 2 × 2 matrices that contain exactly two 1's and two 0's is a linearly independent
set in $M_{22}$.

(b) If $\{\mathbf{v}_1, \mathbf{v}_2\}$ is a linearly dependent set, then each vector is a scalar multiple of the other.

(c) If $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is a linearly independent set, then so is the set $\{k\mathbf{v}_1, k\mathbf{v}_2, k\mathbf{v}_3\}$ for every
nonzero scalar k.

(d) The converse of Theorem 5.3.2a is also true.



25. Show that if $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is a linearly dependent set with nonzero vectors, then each vector in the
set is expressible as a linear combination of the other two.

26. Theorem 5.3.3 implies that four nonzero vectors in $R^3$ must be linearly dependent. Give an
informal geometric argument to explain this result.

27.

(a) In Example 3 we showed that the mutually orthogonal vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ form a linearly
independent set of vectors in $R^3$. Do you think that every set of three nonzero mutually
orthogonal vectors in $R^3$ is linearly independent? Justify your conclusion with a geometric
argument.

(b) Justify your conclusion with an algebraic argument.
Hint Use dot products.

5.4

BASIS AND DIMENSION 



We usually think of a line as being one-dimensional, a plane as 
two-dimensional, and the space around us as three-dimensional. It is the 
primary purpose of this section to make this intuitive notion of "dimension" more 
precise. 



Nonrectangular Coordinate Systems 

In plane analytic geometry we learned to associate a point P in the plane with a pair of coordinates (a, b) by projecting P onto a
pair of perpendicular coordinate axes (Figure 5.4.1a). By this process, each point in the plane is assigned a unique set of 
coordinates, and conversely, each pair of coordinates is associated with a unique point in the plane. We describe this by saying that 
the coordinate system establishes a one-to-one correspondence between points in the plane and ordered pairs of real numbers. 
Although perpendicular coordinate axes are the most common, any two nonparallel lines can be used to define a coordinate system 
in the plane. For example, in Figure 5.4.1b, we have attached a pair of coordinates (a, b) to the point P by projecting P parallel to
the nonperpendicular coordinate axes. Similarly, in 3-space any three noncoplanar coordinate axes can be used to define a 
coordinate system (Figure 5.4.1c). 
Figure 5.4.1
(a) Coordinates of P in a rectangular coordinate system in 2-space
(b) Coordinates of P in a nonrectangular coordinate system in 2-space
(c) Coordinates of P in a nonrectangular coordinate system in 3-space


Our first objective in this section is to extend the concept of a coordinate system to general vector spaces. As a start, it will be 
helpful to reformulate the notion of a coordinate system in 2-space or 3-space using vectors rather than coordinate axes to specify 
the coordinate system. This can be done by replacing each coordinate axis with a vector of length 1 that points in the positive 
direction of the axis. In Figure 5.4.2a, for example, $\mathbf{v}_1$ and $\mathbf{v}_2$ are such vectors. As illustrated in that figure, if P is any point in the
plane, the vector $\overrightarrow{OP}$ can be written as a linear combination of $\mathbf{v}_1$ and $\mathbf{v}_2$ by projecting P parallel to $\mathbf{v}_1$ and $\mathbf{v}_2$ to make $\overrightarrow{OP}$ the
diagonal of a parallelogram determined by vectors $a\mathbf{v}_1$ and $b\mathbf{v}_2$:

$\overrightarrow{OP} = a\mathbf{v}_1 + b\mathbf{v}_2$

It is evident that the numbers a and b in this vector formula are precisely the coordinates of P in the coordinate system of Figure
5.4.1b. Similarly, the coordinates (a, b, c) of the point P in Figure 5.4.1c can be obtained by expressing $\overrightarrow{OP}$ as a linear
combination of the vectors shown in Figure 5.4.2b.
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Informally stated, vectors that specify a coordinate system are called "basis vectors" for that system. Although we used basis 
vectors of length 1 in the preceding discussion, we shall see in a moment that this is not essential — nonzero vectors of any length 
will suffice. 



The scales of measurement along the coordinate axes are essential ingredients of any coordinate system. Usually, one tries to use 
the same scale on each axis and to have the integer points on the axes spaced 1 unit of distance apart. However, this is not always 
practical or appropriate: Unequal scales or scales in which the integral points are more or less than 1 unit apart may be required to 
fit a particular graph on a printed page or to represent physical quantities with diverse units in the same coordinate system (time in 
seconds on one axis and temperature in hundreds of degrees on another, for example). When a coordinate system is specified by a 
set of basis vectors, then the lengths of those vectors correspond to the distances between successive integer points on the 
coordinate axes (Figure 5.4.3). Thus it is the directions of the basis vectors that define the positive directions of the coordinate axes 
and the lengths of the basis vectors that establish the scales of measurement.

Figure 5.4.3 (a) Equal scales, perpendicular axes. (b) Unequal scales, perpendicular axes. (c) Equal scales, skew axes. (d) Unequal scales, skew axes.



The following key definition will make the preceding ideas more precise and enable us to extend the concept of a coordinate 
system to general vector spaces. 



DEFINITION 


If V is any vector space and $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a set of vectors in V, then S is called a basis for V if the following two
conditions hold:

(a) S is linearly independent.

(b) S spans V.





A basis is the vector space generalization of a coordinate system in 2-space and 3-space. The following theorem will help us to see 
why this is so. 



THEOREM 5.4.1 



Uniqueness of Basis Representation

If $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis for a vector space V, then every vector v in V can be expressed in the form
$\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$ in exactly one way.







Proof Since S spans V, it follows from the definition of a spanning set that every vector in V is expressible as a linear
combination of the vectors in S. To see that there is only one way to express a vector as a linear combination of the vectors in S,
suppose that some vector v can be written as

$$\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$$

and also as

$$\mathbf{v} = k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_n\mathbf{v}_n$$

Subtracting the second equation from the first gives

$$\mathbf{0} = (c_1 - k_1)\mathbf{v}_1 + (c_2 - k_2)\mathbf{v}_2 + \cdots + (c_n - k_n)\mathbf{v}_n$$

Since the right side of this equation is a linear combination of vectors in S, the linear independence of S
implies that

$$c_1 - k_1 = 0, \quad c_2 - k_2 = 0, \quad \ldots, \quad c_n - k_n = 0$$

that is,

$$c_1 = k_1, \quad c_2 = k_2, \quad \ldots, \quad c_n = k_n$$

Thus, the two expressions for v are the same.

■ 

Coordinates Relative to a Basis 

If $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis for a vector space V, and

$$\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$$

is the expression for a vector v in terms of the basis S, then the scalars $c_1, c_2, \ldots, c_n$ are called the coordinates of v relative to the
basis S. The vector $(c_1, c_2, \ldots, c_n)$ in $R^n$ constructed from these coordinates is called the coordinate vector of v relative to S; it is
denoted by

$$(\mathbf{v})_S = (c_1, c_2, \ldots, c_n)$$

Remark It should be noted that coordinate vectors depend not only on the basis S but also on the order in which the basis vectors 
are written; a change in the order of the basis vectors results in a corresponding change of order for the entries in the coordinate 
vectors. 



EXAMPLE 1 Standard Basis for $R^3$



In Example 3 of the preceding section, we showed that if

$$\mathbf{i} = (1, 0, 0), \quad \mathbf{j} = (0, 1, 0), \quad \text{and} \quad \mathbf{k} = (0, 0, 1)$$

then $S = \{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$ is a linearly independent set in $R^3$. This set also spans $R^3$ since any vector $\mathbf{v} = (a, b, c)$ in $R^3$ can be written as

$$\mathbf{v} = (a, b, c) = a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1) = a\mathbf{i} + b\mathbf{j} + c\mathbf{k} \tag{1}$$

Thus S is a basis for $R^3$; it is called the standard basis for $R^3$. Looking at the coefficients of i, j, and k in (1), it follows that the
coordinates of v relative to the standard basis are a, b, and c, so

$$(\mathbf{v})_S = (a, b, c)$$

Comparing this result to (1), we see that

$$\mathbf{v} = (\mathbf{v})_S$$

This equation states that the components of a vector v relative to a rectangular xyz-coordinate system and the coordinates of v
relative to the standard basis are the same; thus, the coordinate system and the basis produce precisely the same one-to-one
correspondence between points in 3-space and ordered triples of real numbers (Figure 5.4.4).




Figure 5.4.4



The results in the preceding example are a special case of those in the next example. 



EXAMPLE 2 Standard Basis for $R^n$



In Example 3 of the preceding section, we showed that if

$$\mathbf{e}_1 = (1, 0, 0, \ldots, 0), \quad \mathbf{e}_2 = (0, 1, 0, \ldots, 0), \quad \ldots, \quad \mathbf{e}_n = (0, 0, 0, \ldots, 1)$$

then

$$S = \{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$$

is a linearly independent set in $R^n$. Moreover, this set also spans $R^n$ since any vector $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ in $R^n$ can be written as

$$\mathbf{v} = v_1\mathbf{e}_1 + v_2\mathbf{e}_2 + \cdots + v_n\mathbf{e}_n \tag{2}$$

Thus S is a basis for $R^n$; it is called the standard basis for $R^n$. It follows from (2) that the coordinates of $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ relative
to the standard basis are $v_1, v_2, \ldots, v_n$, so

$$(\mathbf{v})_S = (v_1, v_2, \ldots, v_n)$$

As in Example 1, we have $\mathbf{v} = (\mathbf{v})_S$, so a vector v and its coordinate vector relative to the standard basis for $R^n$ are the same.



Remark We will see in a subsequent example that a vector and its coordinate vector need not be the same; the equality that we
observed in the two preceding examples is a special situation that occurs only with the standard basis for $R^n$.



Remark In $R^2$ and $R^3$, the standard basis vectors are commonly denoted by i, j, and k, rather than by $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$. We shall use
both notations, depending on the particular situation.



EXAMPLE 3 Demonstrating That a Set of Vectors Is a Basis 



Let $\mathbf{v}_1 = (1, 2, 1)$, $\mathbf{v}_2 = (2, 9, 0)$, and $\mathbf{v}_3 = (3, 3, 4)$. Show that the set $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is a basis for $R^3$.

Solution

To show that the set S spans $R^3$, we must show that an arbitrary vector $\mathbf{b} = (b_1, b_2, b_3)$ can be expressed as a linear combination

$$\mathbf{b} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$$

of the vectors in S. Expressing this equation in terms of components gives

$$(b_1, b_2, b_3) = c_1(1, 2, 1) + c_2(2, 9, 0) + c_3(3, 3, 4)$$

or

$$(b_1, b_2, b_3) = (c_1 + 2c_2 + 3c_3,\ 2c_1 + 9c_2 + 3c_3,\ c_1 + 4c_3)$$

or, on equating corresponding components,

$$\begin{aligned} c_1 + 2c_2 + 3c_3 &= b_1 \\ 2c_1 + 9c_2 + 3c_3 &= b_2 \\ c_1 \phantom{{}+ 2c_2} + 4c_3 &= b_3 \end{aligned} \tag{3}$$

Thus, to show that S spans $R^3$, we must demonstrate that system (3) has a solution for all choices of $\mathbf{b} = (b_1, b_2, b_3)$.

To prove that S is linearly independent, we must show that the only solution of

$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3 = \mathbf{0} \tag{4}$$

is $c_1 = c_2 = c_3 = 0$. As above, if (4) is expressed in terms of components, the verification of independence reduces to showing that
the homogeneous system

$$\begin{aligned} c_1 + 2c_2 + 3c_3 &= 0 \\ 2c_1 + 9c_2 + 3c_3 &= 0 \\ c_1 \phantom{{}+ 2c_2} + 4c_3 &= 0 \end{aligned} \tag{5}$$

has only the trivial solution. Observe that systems (3) and (5) have the same coefficient matrix. Thus, by parts (b), (e), and (g) of
Theorem 4.3.4, we can simultaneously prove that S is linearly independent and spans $R^3$ by demonstrating that in systems (3) and (5),
the matrix of coefficients has a nonzero determinant. From

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 9 & 3 \\ 1 & 0 & 4 \end{bmatrix} \quad \text{we find} \quad \det(A) = \begin{vmatrix} 1 & 2 & 3 \\ 2 & 9 & 3 \\ 1 & 0 & 4 \end{vmatrix} = -1$$

and so S is a basis for $R^3$.
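
The determinant test used in Example 3 can be checked numerically. The following is a minimal sketch in Python, not part of the original text: the helper names `det` and `is_basis_of_Rn` are hypothetical, and exact rational arithmetic from the standard `fractions` module is assumed so that no rounding enters the test.

```python
from fractions import Fraction

def det(matrix):
    """Determinant of a square matrix by Gaussian elimination (exact arithmetic)."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    sign = 1
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign  # a row interchange flips the sign of the determinant
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return sign * result

def is_basis_of_Rn(vectors):
    """n vectors in R^n form a basis when the matrix with those columns has nonzero determinant."""
    n = len(vectors)
    if any(len(v) != n for v in vectors):
        return False
    columns_as_matrix = [[v[i] for v in vectors] for i in range(n)]
    return det(columns_as_matrix) != 0

# The vectors of Example 3, placed as the columns of the coefficient matrix A.
v1, v2, v3 = (1, 2, 1), (2, 9, 0), (3, 3, 4)
print(det([[1, 2, 3], [2, 9, 3], [1, 0, 4]]))
print(is_basis_of_Rn([v1, v2, v3]))
```

The shortcut applies only when the number of vectors equals the dimension of the space; for other cases, independence and spanning have to be examined separately.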



EXAMPLE 4 Representing a Vector Using Two Bases 



Let $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ be the basis for $R^3$ in the preceding example.

(a) Find the coordinate vector of $\mathbf{v} = (5, -1, 9)$ with respect to S.

(b) Find the vector v in $R^3$ whose coordinate vector with respect to the basis S is $(\mathbf{v})_S = (-1, 3, 2)$.

Solution (a)

We must find scalars $c_1, c_2, c_3$ such that

$$\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$$

or, in terms of components,

$$(5, -1, 9) = c_1(1, 2, 1) + c_2(2, 9, 0) + c_3(3, 3, 4)$$

Equating corresponding components gives

$$\begin{aligned} c_1 + 2c_2 + 3c_3 &= 5 \\ 2c_1 + 9c_2 + 3c_3 &= -1 \\ c_1 \phantom{{}+ 2c_2} + 4c_3 &= 9 \end{aligned}$$

Solving this system, we obtain $c_1 = 1$, $c_2 = -1$, $c_3 = 2$ (verify). Therefore,

$$(\mathbf{v})_S = (1, -1, 2)$$

Solution (b)

Using the definition of the coordinate vector $(\mathbf{v})_S$, we obtain

$$\begin{aligned} \mathbf{v} &= (-1)\mathbf{v}_1 + 3\mathbf{v}_2 + 2\mathbf{v}_3 \\ &= (-1)(1, 2, 1) + 3(2, 9, 0) + 2(3, 3, 4) = (11, 31, 7) \end{aligned}$$
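
Both directions of Example 4 can be reproduced with a short computation. The sketch below is an illustration only: the function names `coordinates` and `from_coordinates` are hypothetical, and Gauss-Jordan elimination over exact fractions is one assumed way of solving the linear system, not necessarily the method intended by the text.

```python
from fractions import Fraction

def coordinates(basis, v):
    """Solve c1*b1 + ... + cn*bn = v for (c1, ..., cn) by Gauss-Jordan elimination."""
    n = len(basis)
    # Augmented matrix: the basis vectors are the columns, v is the right-hand side.
    aug = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(v[i])]
           for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("the given vectors do not form a basis")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def from_coordinates(basis, coords):
    """Rebuild a vector from its coordinate vector relative to the given basis."""
    n = len(basis[0])
    return [sum(c * b[i] for c, b in zip(coords, basis)) for i in range(n)]

S = [(1, 2, 1), (2, 9, 0), (3, 3, 4)]
print(coordinates(S, (5, -1, 9)))        # part (a): coordinate vector of v relative to S
print(from_coordinates(S, (-1, 3, 2)))   # part (b): vector with the given coordinates
```

Reordering the vectors in `S` permutes the entries of the result, which matches the earlier remark that coordinate vectors depend on the order of the basis vectors.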



EXAMPLE 5 Standard Basis for $P_n$



(a) Show that $S = \{1, x, x^2, \ldots, x^n\}$ is a basis for the vector space $P_n$ of polynomials of the form $a_0 + a_1x + \cdots + a_nx^n$.

(b) Find the coordinate vector of the polynomial $\mathbf{p} = a_0 + a_1x + a_2x^2$ relative to the basis $S = \{1, x, x^2\}$ for $P_2$.



Solution (a)

We showed that S spans $P_n$ in Example 11 of Section 5.2, and we showed that S is a linearly independent set in Example 5 of
Section 5.3. Thus S is a basis for $P_n$; it is called the standard basis for $P_n$.

Solution (b)

The coordinates of $\mathbf{p} = a_0 + a_1x + a_2x^2$ are the scalar coefficients of the basis vectors 1, x, and $x^2$, so $(\mathbf{p})_S = (a_0, a_1, a_2)$.



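EXAMPLE 6 Standard Basis for $M_{mn}$

Let

$$M_1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad M_2 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad M_3 = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \quad M_4 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$$

The set $S = \{M_1, M_2, M_3, M_4\}$ is a basis for the vector space $M_{22}$ of 2 × 2 matrices. To see that S spans $M_{22}$, note that an
arbitrary vector (matrix)

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

can be written as

$$\begin{aligned} \begin{bmatrix} a & b \\ c & d \end{bmatrix} &= a\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + b\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} + c\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} + d\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \\ &= aM_1 + bM_2 + cM_3 + dM_4 \end{aligned}$$

To see that S is linearly independent, assume that

$$aM_1 + bM_2 + cM_3 + dM_4 = 0$$

That is,

$$a\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + b\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} + c\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} + d\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$

It follows that

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$

Thus a = b = c = d = 0, so S is linearly independent. The basis S in this example is called the standard basis for $M_{22}$. More
generally, the standard basis for $M_{mn}$ consists of the mn different matrices with a single 1 and zeros for the remaining entries.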



EXAMPLE 7 Basis for the Subspace span(S) 



If $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a linearly independent set in a vector space V, then S is a basis for the subspace span(S) since the set S
spans span(S) by definition of span(S).



DEFINITION 



A nonzero vector space V is called finite-dimensional if it contains a finite set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ that forms a basis. If
no such set exists, V is called infinite-dimensional. In addition, we shall regard the zero vector space to be finite-dimensional.



EXAMPLE 8 Some Finite- and Infinite-Dimensional Spaces 



By Examples 2, 5, and 6, the vector spaces $R^n$, $P_n$, and $M_{mn}$ are finite-dimensional. The vector spaces
$F(-\infty, \infty)$, $C(-\infty, \infty)$, $C^m(-\infty, \infty)$, and $C^\infty(-\infty, \infty)$ are infinite-dimensional (Exercise 24).

The next theorem will provide the key to the concept of dimension. 



THEOREM 5.4.2 



Let V be a finite-dimensional vector space, and let $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be any basis.

(a) If a set has more than n vectors, then it is linearly dependent. 

(b) If a set has fewer than n vectors, then it does not span V. 



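Proof (a) Let $S' = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m\}$ be any set of m vectors in V, where m > n. We want to show that $S'$ is linearly dependent.
Since $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis, each $\mathbf{w}_i$ can be expressed as a linear combination of the vectors in S, say

$$\begin{aligned} \mathbf{w}_1 &= a_{11}\mathbf{v}_1 + a_{21}\mathbf{v}_2 + \cdots + a_{n1}\mathbf{v}_n \\ \mathbf{w}_2 &= a_{12}\mathbf{v}_1 + a_{22}\mathbf{v}_2 + \cdots + a_{n2}\mathbf{v}_n \\ &\ \ \vdots \\ \mathbf{w}_m &= a_{1m}\mathbf{v}_1 + a_{2m}\mathbf{v}_2 + \cdots + a_{nm}\mathbf{v}_n \end{aligned} \tag{6}$$

To show that $S'$ is linearly dependent, we must find scalars $k_1, k_2, \ldots, k_m$, not all zero, such that

$$k_1\mathbf{w}_1 + k_2\mathbf{w}_2 + \cdots + k_m\mathbf{w}_m = \mathbf{0} \tag{7}$$

Using the equations in (6), we can rewrite (7) as

$$\begin{aligned} &(k_1a_{11} + k_2a_{12} + \cdots + k_ma_{1m})\mathbf{v}_1 \\ &\quad + (k_1a_{21} + k_2a_{22} + \cdots + k_ma_{2m})\mathbf{v}_2 \\ &\quad + \cdots \\ &\quad + (k_1a_{n1} + k_2a_{n2} + \cdots + k_ma_{nm})\mathbf{v}_n = \mathbf{0} \end{aligned}$$

Thus, from the linear independence of S, the problem of proving that $S'$ is a linearly dependent set
reduces to showing there are scalars $k_1, k_2, \ldots, k_m$, not all zero, that satisfy

$$\begin{aligned} a_{11}k_1 + a_{12}k_2 + \cdots + a_{1m}k_m &= 0 \\ a_{21}k_1 + a_{22}k_2 + \cdots + a_{2m}k_m &= 0 \\ &\ \ \vdots \\ a_{n1}k_1 + a_{n2}k_2 + \cdots + a_{nm}k_m &= 0 \end{aligned} \tag{8}$$

But (8) has more unknowns than equations, so the proof is complete since Theorem 1.2.1 guarantees the
existence of nontrivial solutions.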



Proof (b) Let $S' = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m\}$ be any set of m vectors in V, where m < n. We want to show that $S'$ does not span V. The
proof will be by contradiction: We will show that assuming $S'$ spans V leads to a contradiction of the linear independence of
$\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$.

If $S'$ spans V, then every vector in V is a linear combination of the vectors in $S'$. In particular, each basis vector $\mathbf{v}_i$ is a linear
combination of the vectors in $S'$, say

$$\begin{aligned} \mathbf{v}_1 &= a_{11}\mathbf{w}_1 + a_{21}\mathbf{w}_2 + \cdots + a_{m1}\mathbf{w}_m \\ \mathbf{v}_2 &= a_{12}\mathbf{w}_1 + a_{22}\mathbf{w}_2 + \cdots + a_{m2}\mathbf{w}_m \\ &\ \ \vdots \\ \mathbf{v}_n &= a_{1n}\mathbf{w}_1 + a_{2n}\mathbf{w}_2 + \cdots + a_{mn}\mathbf{w}_m \end{aligned} \tag{9}$$

To obtain our contradiction, we will show that there are scalars $k_1, k_2, \ldots, k_n$, not all zero, such that

$$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_n\mathbf{v}_n = \mathbf{0} \tag{10}$$

But observe that (9) and (10) have the same form as (6) and (7) except that m and n are interchanged and the w's and v's are interchanged.
Thus the computations that led to (8) now yield

$$\begin{aligned} a_{11}k_1 + a_{12}k_2 + \cdots + a_{1n}k_n &= 0 \\ a_{21}k_1 + a_{22}k_2 + \cdots + a_{2n}k_n &= 0 \\ &\ \ \vdots \\ a_{m1}k_1 + a_{m2}k_2 + \cdots + a_{mn}k_n &= 0 \end{aligned}$$

This linear system has more unknowns than equations and hence has nontrivial solutions by Theorem 1.2.1.

■

It follows from the preceding theorem that if $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is any basis for a vector space V, then all sets in V that
simultaneously span V and are linearly independent must have precisely n vectors. Thus, all bases for V must have the same 
number of vectors as the arbitrary basis S. This yields the following result, which is one of the most important in linear algebra. 

THEOREM 5.4.3 



All bases for a finite-dimensional vector space have the same number of vectors. 



To see how this theorem is related to the concept of "dimension," recall that the standard basis for R n has n vectors (Example 2). 
Thus Theorem 5.4.3 implies that all bases for $R^n$ have n vectors. In particular, every basis for $R^3$ has three vectors, every basis for
$R^2$ has two vectors, and every basis for $R^1$ (= R) has one vector. Intuitively, $R^3$ is three-dimensional, $R^2$ (a plane) is
two-dimensional, and R (a line) is one-dimensional. Thus, for familiar vector spaces, the number of vectors in a basis is the same
as the dimension. This suggests the following definition. 



DEFINITION 



The dimension of a finite-dimensional vector space V, denoted by dim( V), is defined to be the number of vectors in a basis for 
V. In addition, we define the zero vector space to have dimension zero. 



Remark 

From here on we shall follow a common convention of regarding the empty set to be a basis for the zero vector space. This is 
consistent with the preceding definition, since the empty set has no vectors and the zero vector space has dimension zero. 



EXAMPLE 9 Dimensions of Some Vector Spaces 



$\dim(R^n) = n$ [The standard basis has n vectors (Example 2).]

$\dim(P_n) = n + 1$ [The standard basis has n + 1 vectors (Example 5).]

$\dim(M_{mn}) = mn$ [The standard basis has mn vectors (Example 6).]



EXAMPLE 10 Dimension of a Solution Space 



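Determine a basis for and the dimension of the solution space of the homogeneous system

$$\begin{aligned} 2x_1 + 2x_2 - x_3 \phantom{{}+ x_4} + x_5 &= 0 \\ -x_1 - x_2 + 2x_3 - 3x_4 + x_5 &= 0 \\ x_1 + x_2 - 2x_3 \phantom{{}+ x_4} - x_5 &= 0 \\ x_3 + x_4 + x_5 &= 0 \end{aligned}$$

Solution

In Example 7 of Section 1.2 it was shown that the general solution of the given system is

$$x_1 = -s - t, \quad x_2 = s, \quad x_3 = -t, \quad x_4 = 0, \quad x_5 = t$$

Therefore, the solution vectors can be written as

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} = \begin{bmatrix} -s - t \\ s \\ -t \\ 0 \\ t \end{bmatrix} = \begin{bmatrix} -s \\ s \\ 0 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} -t \\ 0 \\ -t \\ 0 \\ t \end{bmatrix} = s\begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} + t\begin{bmatrix} -1 \\ 0 \\ -1 \\ 0 \\ 1 \end{bmatrix}$$

which shows that the vectors

$$\mathbf{v}_1 = \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} \quad \text{and} \quad \mathbf{v}_2 = \begin{bmatrix} -1 \\ 0 \\ -1 \\ 0 \\ 1 \end{bmatrix}$$

span the solution space. Since they are also linearly independent (verify), $\{\mathbf{v}_1, \mathbf{v}_2\}$ is a basis, and the solution space is
two-dimensional.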
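
The procedure in Example 10 (reduce the coefficient matrix, assign a parameter to each free variable, and read off one basis vector per parameter) can be sketched in Python. This is an illustrative sketch under stated assumptions: the function name `null_space_basis` is hypothetical, and the reduction is ordinary Gauss-Jordan elimination over exact fractions.

```python
from fractions import Fraction

def null_space_basis(matrix):
    """Basis for the solution space of A x = 0, one vector per free variable."""
    a = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(a), len(a[0])
    pivot_cols = []
    r = 0  # index of the next pivot row
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if a[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column, so it is a free variable
        a[r], a[pivot] = a[pivot], a[r]
        p = a[r][c]
        a[r] = [x / p for x in a[r]]
        for i in range(rows):
            if i != r and a[i][c] != 0:
                f = a[i][c]
                a[i] = [x - f * y for x, y in zip(a[i], a[r])]
        pivot_cols.append(c)
        r += 1
        if r == rows:
            break
    free_cols = [c for c in range(cols) if c not in pivot_cols]
    basis = []
    for f in free_cols:
        # Set this free variable to 1, the other free variables to 0,
        # and solve for the leading variables from the reduced rows.
        vec = [Fraction(0)] * cols
        vec[f] = Fraction(1)
        for row_index, pc in enumerate(pivot_cols):
            vec[pc] = -a[row_index][f]
        basis.append(vec)
    return basis

# Coefficient matrix of the homogeneous system in Example 10.
A = [
    [2, 2, -1, 0, 1],
    [-1, -1, 2, -3, 1],
    [1, 1, -2, 0, -1],
    [0, 0, 1, 1, 1],
]
basis = null_space_basis(A)
print(len(basis))  # dimension of the solution space
for vec in basis:
    print([int(x) for x in vec])
```

For this system the free variables are $x_2$ and $x_5$, corresponding to the parameters s and t of the example.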

Some Fundamental Theorems 

We shall devote the remainder of this section to a series of theorems that reveal the subtle interrelationships among the concepts of 
spanning, linear independence, basis, and dimension. These theorems are not idle exercises in mathematical theory — they are 
essential to the understanding of vector spaces, and many practical applications of linear algebra build on them. 

The following theorem, which we call the Plus/Minus Theorem (our own name), establishes two basic principles on which most of 
the theorems to follow will rely. 



THEOREM 5.4.4 



Plus/Minus Theorem 

Let S be a nonempty set of vectors in a vector space V. 

(a) If S is a linearly independent set, and if v is a vector in V that is outside of span(S), then the set $S \cup \{\mathbf{v}\}$ that results by
inserting v into S is still linearly independent.

(b) If v is a vector in S that is expressible as a linear combination of other vectors in S, and if $S - \{\mathbf{v}\}$ denotes the set
obtained by removing v from S, then S and $S - \{\mathbf{v}\}$ span the same space; that is,

$$\operatorname{span}(S) = \operatorname{span}(S - \{\mathbf{v}\})$$



We shall defer the proof to the end of the section, so that we may move more immediately to the consequences of the theorem. 
However, the theorem can be visualized in $R^3$ as follows:

(a) A set S of two linearly independent vectors in $R^3$ spans a plane through the origin. If we enlarge S by inserting any vector v
outside of this plane (Figure 5.4.5a), then the resulting set of three vectors is still linearly independent since none of the 
three vectors lies in the same plane as the other two. 




Figure 5.4.5 (a) None of the three vectors lies in the same plane as the other two. (b) Any of the vectors can be removed, and the remaining two will still span the plane. (c) Either of the collinear vectors can be removed, and the remaining two will still span the plane.



(b) If S is a set of three noncollinear vectors in $R^3$ that lie in a common plane through the origin (Figure 5.4.5b, c), then the
three vectors span the plane. However, if we remove from S any vector v that is a linear combination of the other two, then 
the remaining set of two vectors still spans the plane. 

In general, to show that a set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis for a vector space V, we must show that the vectors are linearly
independent and span V. However, if we happen to know that V has dimension n (so that $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ contains the right
number of vectors for a basis), then it suffices to check either linear independence or spanning — the remaining condition will hold 
automatically. This is the content of the following theorem. 

THEOREM 5.4.5 



If V is an n-dimensional vector space, and if S is a set in V with exactly n vectors, then S is a basis for V if either S spans V or S
is linearly independent. 



Proof Assume that S has exactly n vectors and spans V. To prove that S is a basis, we must show that S is a linearly independent
set. But if this is not so, then some vector v in S is a linear combination of the remaining vectors. If we remove this vector from S,
then it follows from the Plus/Minus Theorem (Theorem 5.4.4b) that the remaining set of n - 1 vectors still spans V. But this is
impossible, since it follows from Theorem 5.4.2b that no set with fewer than n vectors can span an n-dimensional vector space.
Thus S is linearly independent.

Assume that S has exactly n vectors and is a linearly independent set. To prove that S is a basis, we must show that S spans V. But
if this is not so, then there is some vector v in V that is not in span(S). If we insert this vector into S, then it follows from the
Plus/Minus Theorem (Theorem 5.4.4a) that this set of n + 1 vectors is still linearly independent. But this is impossible, since it
follows from Theorem 5.4.2a that no set with more than n vectors in an n-dimensional vector space can be linearly independent.
Thus S spans V.



EXAMPLE 11 Checking for a Basis



(a) Show that $\mathbf{v}_1 = (-3, 7)$ and $\mathbf{v}_2 = (5, 5)$ form a basis for $R^2$ by inspection.

(b) Show that $\mathbf{v}_1 = (2, 0, -1)$, $\mathbf{v}_2 = (4, 0, 7)$, and $\mathbf{v}_3 = (-1, 1, 4)$ form a basis for $R^3$ by inspection.



Solution (a)

Since neither vector is a scalar multiple of the other, the two vectors form a linearly independent set in the two-dimensional space
$R^2$, and hence they form a basis by Theorem 5.4.5.

Solution (b)

The vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ form a linearly independent set in the xz-plane (why?). The vector $\mathbf{v}_3$ is outside of the xz-plane, so the set
$\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is also linearly independent. Since $R^3$ is three-dimensional, Theorem 5.4.5 implies that $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is a basis for $R^3$.

The following theorem shows that for a finite-dimensional vector space V , every set that spans V contains a basis for V within it, 
and every linearly independent set in V is part of some basis for V. 

THEOREM 5.4.6 



Let S be a finite set of vectors in a finite-dimensional vector space V. 

(a) If S spans V but is not a basis for V, then S can be reduced to a basis for V by removing appropriate vectors from S.

(b) If S is a linearly independent set that is not already a basis for V, then S can be enlarged to a basis for V by inserting 
appropriate vectors into S. 



Proof (a) If S is a set of vectors that spans V but is not a basis for V, then S is a linearly dependent set. Thus some vector v in S is
expressible as a linear combination of the other vectors in S. By the Plus/Minus Theorem (Theorem 5.4.4b), we can remove v from
S, and the resulting set $S'$ will still span V. If $S'$ is linearly independent, then $S'$ is a basis for V, and we are done. If $S'$ is linearly
dependent, then we can remove some appropriate vector from $S'$ to produce a set $S''$ that still spans V. We can continue removing
vectors in this way until we finally arrive at a set of vectors in S that is linearly independent and spans V. This subset of S is a basis
for V.



Proof (b) Suppose that dim(V) = n. If S is a linearly independent set that is not already a basis for V, then S fails to span V, and
there is some vector v in V that is not in span(S). By the Plus/Minus Theorem (Theorem 5.4.4a), we can insert v into S, and the
resulting set $S'$ will still be linearly independent. If $S'$ spans V, then $S'$ is a basis for V, and we are finished. If $S'$ does not span V,
then we can insert an appropriate vector into $S'$ to produce a set $S''$ that is still linearly independent. We can continue inserting
vectors in this way until we reach a set with n linearly independent vectors in V. This set will be a basis for V by Theorem 5.4.5.
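
Both parts of Theorem 5.4.6 suggest simple procedures for vectors in $R^n$, sketched below in Python. Several assumptions are made and should be read as such: the names `rank`, `reduce_to_basis`, and `extend_to_basis` are hypothetical; the reduction scans the set once and keeps a vector only when it lies outside the span of those already kept, which is a variation on the one-at-a-time removal in the proof; and the enlargement draws candidates only from the standard basis vectors, which is one convenient choice among many.

```python
from fractions import Fraction

def rank(vectors):
    """Number of linearly independent vectors in the list, by row reduction."""
    a = [[Fraction(x) for x in v] for v in vectors]
    cols = len(a[0]) if a else 0
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if pivot is None:
            continue
        a[r], a[pivot] = a[pivot], a[r]
        for i in range(r + 1, len(a)):
            f = a[i][c] / a[r][c]
            a[i] = [x - f * y for x, y in zip(a[i], a[r])]
        r += 1
        if r == len(a):
            break
    return r

def reduce_to_basis(spanning_set):
    """Part (a): discard vectors that are combinations of the vectors already kept."""
    kept = []
    for v in spanning_set:
        if rank(kept + [v]) > len(kept):
            kept.append(v)
    return kept

def extend_to_basis(independent_set, n):
    """Part (b): insert standard basis vectors of R^n lying outside the current span.

    Assumes independent_set is linearly independent.
    """
    basis = list(independent_set)
    for i in range(n):
        e = tuple(1 if j == i else 0 for j in range(n))
        if len(basis) < n and rank(basis + [e]) > len(basis):
            basis.append(e)
    return basis

print(reduce_to_basis([(1, 0), (2, 0), (0, 1), (1, 1)]))
print(extend_to_basis([(-1, 2, 3), (1, -2, -2)], 3))
```

Each insertion in `extend_to_basis` is justified by part (a) of the Plus/Minus Theorem, and each discarded vector in `reduce_to_basis` by part (b).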



It can be proved (Exercise 30) that any subspace of a finite-dimensional vector space is finite-dimensional. We conclude this 
section with a theorem showing that the dimension of a subspace of a finite-dimensional vector space V cannot exceed the 
dimension of V itself and that the only way a subspace can have the same dimension as V is if the subspace is the entire vector
space V. Figure 5.4.6 illustrates this idea in $R^3$. In that figure, observe that successively larger subspaces increase in dimension.

Figure 5.4.6



THEOREM 5.4.7 



If W is a subspace of a finite-dimensional vector space V, then $\dim(W) \le \dim(V)$; moreover, if $\dim(W) = \dim(V)$, then W = V.



Proof Since V is finite-dimensional, so is W by Exercise 30. Accordingly, suppose that $S = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m\}$ is a basis for W.
Either S is also a basis for V or it is not. If it is, then $\dim(W) = \dim(V) = m$. If it is not, then by Theorem 5.4.6b, vectors can be
added to the linearly independent set S to make it into a basis for V, so $\dim(W) < \dim(V)$. Thus $\dim(W) \le \dim(V)$ in all cases. If
$\dim(W) = \dim(V)$, then S is a set of m linearly independent vectors in the m-dimensional vector space V; hence S is a basis for V
by Theorem 5.4.5. This implies that W = V (why?).



Additional Proofs 



Proof of Theorem 5.4.4a Assume that $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a linearly independent set of vectors in V, and v is a vector in V
outside of span(S). To show that $S' = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r, \mathbf{v}\}$ is a linearly independent set, we must show that the only scalars that
satisfy

$$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_r\mathbf{v}_r + k_{r+1}\mathbf{v} = \mathbf{0} \tag{11}$$

are $k_1 = k_2 = \cdots = k_r = k_{r+1} = 0$. But we must have $k_{r+1} = 0$; otherwise, we could solve (11) for v as a linear
combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$, contradicting the assumption that v is outside of span(S). Thus (11) simplifies
to

$$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_r\mathbf{v}_r = \mathbf{0} \tag{12}$$

which, by the linear independence of $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$, implies that

$$k_1 = k_2 = \cdots = k_r = 0$$



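Proof of Theorem 5.4.4b Assume that $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a set of vectors in V, and to be specific, suppose that $\mathbf{v}_r$ is a linear
combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{r-1}$, say

$$\mathbf{v}_r = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_{r-1}\mathbf{v}_{r-1} \tag{13}$$

We want to show that if $\mathbf{v}_r$ is removed from S, then the remaining set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{r-1}\}$ still
spans span(S); that is, we must show that every vector w in span(S) is expressible as a linear
combination of $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{r-1}\}$. But if w is in span(S), then w is expressible in the form

$$\mathbf{w} = k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_{r-1}\mathbf{v}_{r-1} + k_r\mathbf{v}_r$$

or, on substituting (13),

$$\mathbf{w} = k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_{r-1}\mathbf{v}_{r-1} + k_r(c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_{r-1}\mathbf{v}_{r-1})$$

which expresses w as a linear combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{r-1}$.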



Exercise Set 5.4 






1. 



Explain why the following sets of vectors are not bases for the indicated vector spaces. (Solve this problem by inspection.) 



(a) $\mathbf{u}_1 = (1, 2)$, $\mathbf{u}_2 = (0, 3)$, $\mathbf{u}_3 = (2, 7)$ for $R^2$



(b) $\mathbf{u}_1 = (-1, 3, 2)$, $\mathbf{u}_2 = (6, 1, 1)$ for $R^3$



(c) $\mathbf{p}_1 = 1 + x + x^2$, $\mathbf{p}_2 = x - 1$ for $P_2$



(d) 



,4 = 



1 1 

2 3 



5 = 



6 
-1 4 



C = 



3 
1 7 



Z) = 



5 1 
4 2 



5 = 



7 1 
2 9 



for $M_{22}$



2. 



Which of the following sets of vectors are bases for $R^2$?

(a) (2, 1), (3, 0)

(b) (4, 1), (-7, -8)

(c) (0, 0), (1, 3)

(d) (3, 9), (-4, -12)



3.



Which of the following sets of vectors are bases for $R^3$?



(a) (1,0,0), (2, 2,0), (3, 3, 3) 



(b) (3, 1,-4), (2, 5, 6), (1,4, 8) 



(c) (2, 3, 1), (4, 1, 1), (0, -7, 1) 



(d) (1,6, 4), (2, 4,-1), (-1,2, 5) 



4.



Which of the following sets of vectors are bases for $P_2$?

(a) $1 - 3x + 2x^2$, $1 + x + 4x^2$, $1 - 7x$

(b) $4 + 6x + x^2$, $-1 + 4x + 2x^2$, $5 + 2x - x^2$

(c) $1 + x + x^2$, $x + x^2$, $x^2$

(d) $-4 + x + 3x^2$, $6 + 5x + 2x^2$, $8 + 4x + x^2$



5. 



Show that the following set of vectors is a basis for $M_{22}$.



3 6 
3 -6 



-1 
-1 



-8 
-12 -4 



1 
-1 2 



6.



Let V be the space spanned by $\mathbf{v}_1 = \cos^2 x$, $\mathbf{v}_2 = \sin^2 x$, $\mathbf{v}_3 = \cos 2x$.

(a) Show that $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is not a basis for V.



(b) Find a basis for V. 



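7.



Find the coordinate vector of w relative to the basis $S = \{\mathbf{u}_1, \mathbf{u}_2\}$ for $R^2$.

(a) $\mathbf{u}_1 = (1, 0)$, $\mathbf{u}_2 = (0, 1)$; $\mathbf{w} = (3, -7)$

(b) $\mathbf{u}_1 = (2, -4)$, $\mathbf{u}_2 = (3, 8)$; $\mathbf{w} = (1, 1)$

(c) $\mathbf{u}_1 = (1, 1)$, $\mathbf{u}_2 = (0, 2)$; $\mathbf{w} = (a, b)$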



8. 



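Find the coordinate vector of w relative to the basis $S = \{\mathbf{u}_1, \mathbf{u}_2\}$ of $R^2$.

(a) $\mathbf{u}_1 = (1, -1)$, $\mathbf{u}_2 = (1, 1)$; $\mathbf{w} = (1, 0)$

(b) $\mathbf{u}_1 = (1, -1)$, $\mathbf{u}_2 = (1, 1)$; $\mathbf{w} = (0, 1)$

(c) $\mathbf{u}_1 = (1, -1)$, $\mathbf{u}_2 = (1, 1)$; $\mathbf{w} = (1, 1)$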



9. 



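Find the coordinate vector of v relative to the basis $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$.

(a) $\mathbf{v} = (2, -1, 3)$; $\mathbf{v}_1 = (1, 0, 0)$, $\mathbf{v}_2 = (2, 2, 0)$, $\mathbf{v}_3 = (3, 3, 3)$

(b) $\mathbf{v} = (5, -12, 3)$; $\mathbf{v}_1 = (1, 2, 3)$, $\mathbf{v}_2 = (-4, 5, 6)$, $\mathbf{v}_3 = (7, -8, 9)$



10.



Find the coordinate vector of p relative to the basis $S = \{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$.

(a) $\mathbf{p} = 4 - 3x + x^2$; $\mathbf{p}_1 = 1$, $\mathbf{p}_2 = x$, $\mathbf{p}_3 = x^2$

(b) $\mathbf{p} = 2 - x + x^2$; $\mathbf{p}_1 = 1 + x$, $\mathbf{p}_2 = 1 + x^2$, $\mathbf{p}_3 = x + x^2$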



11. 



Find the coordinate vector of A relative to the basis $S = \{A_1, A_2, A_3, A_4\}$.



A = 



2 
-1 3 



Ai = 



-1 1 




A 2 = 



1 1 




A 3 = 





1 



A 4 = 




1 



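In Exercises 12-17 determine the dimension of and a basis for the solution space of the system.

12.

$$\begin{aligned} x_1 + x_2 - x_3 &= 0 \\ -2x_1 - x_2 + 2x_3 &= 0 \\ -x_1 \phantom{{}+ x_2} + x_3 &= 0 \end{aligned}$$

13.

$$\begin{aligned} 3x_1 + x_2 + x_3 + x_4 &= 0 \\ 5x_1 - x_2 + x_3 - x_4 &= 0 \end{aligned}$$

14.

$$\begin{aligned} x_1 - 4x_2 + 3x_3 - x_4 &= 0 \\ 2x_1 - 8x_2 + 6x_3 - 2x_4 &= 0 \end{aligned}$$

15.

$$\begin{aligned} x_1 - 3x_2 + x_3 &= 0 \\ 2x_1 - 6x_2 + 2x_3 &= 0 \\ 3x_1 - 9x_2 + 3x_3 &= 0 \end{aligned}$$

16.

$$\begin{aligned} 2x_1 + x_2 + 3x_3 &= 0 \\ x_1 \phantom{{}+ x_2} + 5x_3 &= 0 \\ x_2 + x_3 &= 0 \end{aligned}$$

17.

$$\begin{aligned} x + y + z &= 0 \\ 3x + 2y - 2z &= 0 \\ 4x + 3y - z &= 0 \\ 6x + 5y + z &= 0 \end{aligned}$$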

18.

Determine bases for the following subspaces of $R^3$.

(a) the plane 3x - 2y + 5z = 0

(b) the plane x - y = 0

(c) the line x = 2t, y = -t, z = 4t

(d) all vectors of the form (a, b, c), where b = a + c

19.

Determine the dimensions of the following subspaces of $R^4$.

(a) all vectors of the form (a, b, c, 0)

(b) all vectors of the form (a, b, c, d), where d = a + b and c = a - b

(c) all vectors of the form (a, b, c, d), where a = b = c = d

20.

Determine the dimension of the subspace of $P_3$ consisting of all polynomials $a_0 + a_1x + a_2x^2 + a_3x^3$ for which $a_0 = 0$.

21.

Find a standard basis vector that can be added to the set $\{\mathbf{v}_1, \mathbf{v}_2\}$ to produce a basis for $R^3$.

(a) $\mathbf{v}_1 = (-1, 2, 3)$, $\mathbf{v}_2 = (1, -2, -2)$

(b) $\mathbf{v}_1 = (1, -1, 0)$, $\mathbf{v}_2 = (3, 1, -2)$

22.

Find standard basis vectors that can be added to the set $\{\mathbf{v}_1, \mathbf{v}_2\}$ to produce a basis for $R^4$.

$\mathbf{v}_1 = (1, -4, 2, -3)$, $\mathbf{v}_2 = (-3, 8, -4, 6)$

23.

Let $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ be a basis for a vector space V. Show that $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ is also a basis, where $\mathbf{u}_1 = \mathbf{v}_1$, $\mathbf{u}_2 = \mathbf{v}_1 + \mathbf{v}_2$, and
$\mathbf{u}_3 = \mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_3$.

24. 

(a) Show that for every positive integer n, one can find n + 1 linearly independent vectors in $F(-\infty, \infty)$.

Hint Look for polynomials.

(b) Use the result in part (a) to prove that $F(-\infty, \infty)$ is infinite-dimensional.

(c) Prove that $C(-\infty, \infty)$, $C^m(-\infty, \infty)$, and $C^\infty(-\infty, \infty)$ are infinite-dimensional vector spaces.

25.

Let S be a basis for an n-dimensional vector space V. Show that if $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ form a linearly independent set of vectors in V,
then the coordinate vectors $(\mathbf{v}_1)_S, (\mathbf{v}_2)_S, \ldots, (\mathbf{v}_r)_S$ form a linearly independent set in $R^n$, and conversely.

26.

Using the notation from Exercise 25, show that if $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ span V, then the coordinate vectors $(\mathbf{v}_1)_S, (\mathbf{v}_2)_S, \ldots, (\mathbf{v}_r)_S$ span
$R^n$, and conversely.

27.

Find a basis for the subspace of $P_2$ spanned by the given vectors.

(a) $-1 + x - 2x^2$, $3 + 3x + 6x^2$, $9$

(b) $1 + x$, $x^2$, $-2 + 2x^2$, $-3x$

(c) $1 + x - 3x^2$, $2 + 2x - 6x^2$, $3 + 3x - 9x^2$

Hint Let S be the standard basis for $P_2$ and work with the coordinate vectors relative to S; note Exercises 25 and 26.

28.

The accompanying figure shows a rectangular xy-coordinate system and an x'y'-coordinate system with skewed axes.
Assuming that 1-unit scales are used on all the axes, find the x'y'-coordinates of the points whose xy-coordinates are given.

(a) (1, 1)

(b) (1, 0)

(c) (0, 1)

(d) (a, b)

Figure Ex-28



29.

The accompanying figure shows a rectangular xy-coordinate system determined by the unit basis vectors i and j and an x'y'-coordinate
system determined by unit basis vectors $\mathbf{u}_1$ and $\mathbf{u}_2$. Find the x'y'-coordinates of the points whose xy-coordinates
are given.



(a) $(\sqrt{3}, 1)$



(b) (1,0) 



(c) (0, 1) 



(d) (a,b) 




Figure Ex-29 



30. 



Prove: Any subspace of a finite-dimensional vector space is finite-dimensional. 



Discussion 

Discovery

31.

The basis that we gave for $M_{22}$ in Example 6 consisted of noninvertible matrices. Do you think that
there is a basis for $M_{22}$ consisting of invertible matrices? Justify your answer.



32. 



(a) The vector space of all diagonal n x n matrices has dimension ______.

(b) The vector space of all symmetric n x n matrices has dimension ______.

(c) The vector space of all upper triangular n x n matrices has dimension ______.



33. 

(a) For a 3 x 3 matrix A, explain in words why the set $I_3, A, A^2, \ldots, A^9$ must be linearly dependent if the ten matrices are distinct.

(b) State a corresponding result for an n x n matrix A. 

34. State the two parts of Theorem 5.4.2 in contrapositive form. [See Exercise 34 of Section 1.4.]



35. 

(a) The equation $x_1 + x_2 + \cdots + x_n = 0$ can be viewed as a linear system of one equation in n unknowns. Make a conjecture about the dimension of its solution space.



(b) Confirm your conjecture by finding a basis. 



36. 

(a) Show that the set W of polynomials in $P_2$ such that $p(1) = 0$ is a subspace of $P_2$.



(b) Make a conjecture about the dimension of W. 

(c) Confirm your conjecture by finding a basis for W. 
5.5 ROW SPACE, COLUMN SPACE, AND NULLSPACE

In this section we shall study three important vector spaces that are associated with matrices. Our work here will provide us with a deeper understanding of the relationships between the solutions of a linear system of equations and properties of its coefficient matrix.



We begin with some definitions. 



DEFINITION 






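For an m x n matrix

\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\]

the vectors

\[
\begin{aligned}
r_1 &= \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \end{bmatrix} \\
r_2 &= \begin{bmatrix} a_{21} & a_{22} & \cdots & a_{2n} \end{bmatrix} \\
&\;\;\vdots \\
r_m &= \begin{bmatrix} a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}
\end{aligned}
\]

in $R^n$ formed from the rows of A are called the row vectors of A, and the vectors

\[
c_1 = \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}, \quad
c_2 = \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}, \quad \ldots, \quad
c_n = \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}
\]

in $R^m$ formed from the columns of A are called the column vectors of A.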





EXAMPLE 1 Row and Column Vectors in a 2 x 3 Matrix 



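Let

\[
A = \begin{bmatrix} 2 & 1 & 0 \\ 3 & -1 & 4 \end{bmatrix}
\]

The row vectors of A are

\[
r_1 = \begin{bmatrix} 2 & 1 & 0 \end{bmatrix} \quad\text{and}\quad r_2 = \begin{bmatrix} 3 & -1 & 4 \end{bmatrix}
\]

and the column vectors of A are

\[
c_1 = \begin{bmatrix} 2 \\ 3 \end{bmatrix}, \quad
c_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \quad\text{and}\quad
c_3 = \begin{bmatrix} 0 \\ 4 \end{bmatrix}
\]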

The following definition defines three important vector spaces associated with a matrix. 



DEFINITION 



If A is an m x n matrix, then the subspace of $R^n$ spanned by the row vectors of A is called the row space of A, and the subspace of $R^m$ spanned by the column vectors of A is called the column space of A. The solution space of the homogeneous system of equations Ax = 0, which is a subspace of $R^n$, is called the nullspace of A.



In this section and the next we shall be concerned with the following two general questions: 

* What relationships exist between the solutions of a linear system Ax = b and the row space, column space, and nullspace of the coefficient matrix A?

* What relationships exist among the row space, column space, and nullspace of a matrix?



To investigate the first of these questions, suppose that

\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\quad\text{and}\quad
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
\]

It follows from Formula 10 of Section 1.3 that if $c_1, c_2, \ldots, c_n$ denote the column vectors of A, then the product Ax can be expressed as a linear combination of these column vectors with coefficients from x; that is,

\[
Ax = x_1 c_1 + x_2 c_2 + \cdots + x_n c_n \tag{1}
\]

Thus a linear system, Ax = b, of m equations in n unknowns can be written as

\[
x_1 c_1 + x_2 c_2 + \cdots + x_n c_n = b \tag{2}
\]

from which we conclude that Ax = b is consistent if and only if b is expressible as a linear combination of the column vectors of A or, equivalently, if and only if b is in the column space of A. This yields the following theorem.



THEOREM 5.5.1 



A system of linear equations Ax = b is consistent if and only if b is in the column space of A.



EXAMPLE 2 A Vector b in the Column Space of A 



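Let Ax = b be the linear system

\[
\begin{bmatrix} -1 & 3 & 2 \\ 1 & 2 & -3 \\ 2 & 1 & -2 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
=
\begin{bmatrix} 1 \\ -9 \\ -3 \end{bmatrix}
\]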



Show that b is in the column space of A, and express b as a linear combination of the column vectors of A. 



Solution 

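Solving the system by Gaussian elimination yields (verify)

\[
x_1 = 2, \quad x_2 = -1, \quad x_3 = 3
\]

Since the system is consistent, b is in the column space of A. Moreover, from 2 and the solution obtained, it follows that

\[
2 \begin{bmatrix} -1 \\ 1 \\ 2 \end{bmatrix}
- \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix}
+ 3 \begin{bmatrix} 2 \\ -3 \\ -2 \end{bmatrix}
= \begin{bmatrix} 1 \\ -9 \\ -3 \end{bmatrix}
\]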
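Theorem 5.5.1 can be checked by machine for the system in Example 2. The following Python sketch is an illustration only and is not part of the text; the helper names `rref` and `in_column_space` are hypothetical. It row-reduces the augmented matrix [A | b] with exact fractions and reports that b is in the column space of A when no leading 1 falls in the last column.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced row-echelon form and pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        # Look for a nonzero entry in column c at or below row r.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]          # create the leading 1
        for i in range(len(m)):
            if i != r:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def in_column_space(A, b):
    """b is in the column space of A iff [A | b] has no pivot in its last column."""
    augmented = [list(row) + [bi] for row, bi in zip(A, b)]
    _, pivots = rref(augmented)
    return len(A[0]) not in pivots

A = [[-1, 3, 2], [1, 2, -3], [2, 1, -2]]
b = [1, -9, -3]
print(in_column_space(A, b))

# Rebuild b as 2*c1 - c2 + 3*c3 using the solution x = (2, -1, 3).
x = [2, -1, 3]
combo = [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]
print(combo == b)
```

The second check is Formula 2 read from right to left: the entries of the solution are exactly the coefficients that express b in terms of the columns of A.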



The next theorem establishes a fundamental relationship between the solutions of a nonhomogeneous linear system Ax = b and those of the corresponding homogeneous linear system Ax = 0 with the same coefficient matrix.



THEOREM 5.5.2 



If $x_0$ denotes any single solution of a consistent linear system Ax = b, and if $v_1, v_2, \ldots, v_k$ form a basis for the nullspace of A (that is, the solution space of the homogeneous system Ax = 0), then every solution of Ax = b can be expressed in the form

\[
x = x_0 + c_1 v_1 + c_2 v_2 + \cdots + c_k v_k \tag{3}
\]

and, conversely, for all choices of scalars $c_1, c_2, \ldots, c_k$, the vector x in this formula is a solution of Ax = b.



Proof Assume that $x_0$ is any fixed solution of Ax = b and that x is an arbitrary solution. Then

\[
Ax_0 = b \quad\text{and}\quad Ax = b
\]

Subtracting these equations yields

\[
Ax - Ax_0 = 0 \quad\text{or}\quad A(x - x_0) = 0
\]

which shows that $x - x_0$ is a solution of the homogeneous system Ax = 0. Since $v_1, v_2, \ldots, v_k$ is a basis for the solution space of this system, we can express $x - x_0$ as a linear combination of these vectors, say

\[
x - x_0 = c_1 v_1 + c_2 v_2 + \cdots + c_k v_k
\]

Thus,

\[
x = x_0 + c_1 v_1 + c_2 v_2 + \cdots + c_k v_k
\]

which proves the first part of the theorem. Conversely, for all choices of the scalars $c_1, c_2, \ldots, c_k$ in 3, we have

\[
Ax = A(x_0 + c_1 v_1 + c_2 v_2 + \cdots + c_k v_k)
\]

or

\[
Ax = Ax_0 + c_1 (Av_1) + c_2 (Av_2) + \cdots + c_k (Av_k)
\]

But $x_0$ is a solution of the nonhomogeneous system, and $v_1, v_2, \ldots, v_k$ are solutions of the homogeneous system, so the last equation implies that

\[
Ax = b + 0 + 0 + \cdots + 0 = b
\]

which shows that x is a solution of Ax = b.



General and Particular Solutions 

There is some terminology associated with Formula 3. The vector $x_0$ is called a particular solution of Ax = b. The expression $x_0 + c_1 v_1 + c_2 v_2 + \cdots + c_k v_k$ is called the general solution of Ax = b, and the expression $c_1 v_1 + c_2 v_2 + \cdots + c_k v_k$ is called the general solution of Ax = 0. With this terminology, Formula 3 states that the general solution of Ax = b is the sum of any particular solution of Ax = b and the general solution of Ax = 0.

For linear systems with two or three unknowns, Theorem 5.5.2 has a nice geometric interpretation in $R^2$ and $R^3$. For example, consider the case where Ax = 0 and Ax = b are linear systems with two unknowns. The solutions of Ax = 0 form a subspace of $R^2$ and hence constitute a line through the origin, the origin only, or all of $R^2$. From Theorem 5.5.2, the solutions of Ax = b can be obtained by adding any particular solution of Ax = b, say $x_0$, to the solutions of Ax = 0. Assuming that $x_0$ is positioned with its initial point at the origin, this has the geometric effect of translating the solution space of Ax = 0, so that the point at the origin is moved to the tip of $x_0$ (Figure 5.5.1). This means that the solution vectors of Ax = b form a line through the tip of $x_0$, the point at the tip of $x_0$, or all of $R^2$. (Can you visualize the last case?) Similarly, for linear systems with three unknowns, the solutions of Ax = b constitute a plane through the tip of any particular solution $x_0$, a line through the tip of $x_0$, the point at the tip of $x_0$, or all of $R^3$.




Figure 5.5.1 Adding $x_0$ to each vector x in the solution space of Ax = 0 translates the solution space.



EXAMPLE 3 General Solution of a Linear System Ax = b 



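In Example 4 of Section 1.2 we solved the nonhomogeneous linear system

\[
\begin{aligned}
x_1 + 3x_2 - 2x_3 \phantom{{}+ 0x_4} + 2x_5 \phantom{{}+ 00x_6} &= 0 \\
2x_1 + 6x_2 - 5x_3 - 2x_4 + 4x_5 - 3x_6 &= -1 \\
5x_3 + 10x_4 \phantom{{}+ 0x_5} + 15x_6 &= 5 \\
2x_1 + 6x_2 \phantom{{}+ 0x_3} + 8x_4 + 4x_5 + 18x_6 &= 6
\end{aligned}
\tag{4}
\]

and obtained

\[
x_1 = -3r - 4s - 2t, \quad x_2 = r, \quad x_3 = -2s, \quad x_4 = s, \quad x_5 = t, \quad x_6 = \tfrac{1}{3}
\]

This result can be written in vector form as

\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{bmatrix}
=
\begin{bmatrix} -3r - 4s - 2t \\ r \\ -2s \\ s \\ t \\ \tfrac{1}{3} \end{bmatrix}
=
\underbrace{\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ \tfrac{1}{3} \end{bmatrix}}_{x_0}
+
\underbrace{r \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ s \begin{bmatrix} -4 \\ 0 \\ -2 \\ 1 \\ 0 \\ 0 \end{bmatrix}
+ t \begin{bmatrix} -2 \\ 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}}_{x}
\tag{5}
\]

which is the general solution of 4. The vector $x_0$ in 5 is a particular solution of 4; the linear combination x in 5 is the general solution of the homogeneous system

\[
\begin{aligned}
x_1 + 3x_2 - 2x_3 \phantom{{}+ 0x_4} + 2x_5 \phantom{{}+ 00x_6} &= 0 \\
2x_1 + 6x_2 - 5x_3 - 2x_4 + 4x_5 - 3x_6 &= 0 \\
5x_3 + 10x_4 \phantom{{}+ 0x_5} + 15x_6 &= 0 \\
2x_1 + 6x_2 \phantom{{}+ 0x_3} + 8x_4 + 4x_5 + 18x_6 &= 0
\end{aligned}
\]

(verify).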
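The "(verify)" in Example 3 can be carried out mechanically. The sketch below is an illustration only; the names `matvec` and `general_solution` are hypothetical, and the coefficient matrix and right-hand side are transcribed from the system labeled 4 under the assumption that its last equation reads $2x_1 + 6x_2 + 8x_4 + 4x_5 + 18x_6 = 6$. It checks that $x_0$ solves Ax = b, that each basis vector solves Ax = 0, and that several choices of r, s, t in Formula 3 again give solutions.

```python
from fractions import Fraction

A = [
    [1, 3, -2, 0, 2, 0],
    [2, 6, -5, -2, 4, -3],
    [0, 0, 5, 10, 0, 15],
    [2, 6, 0, 8, 4, 18],
]
b = [0, -1, 5, 6]

x0 = [0, 0, 0, 0, 0, Fraction(1, 3)]   # particular solution
v1 = [-3, 1, 0, 0, 0, 0]               # basis vectors for the nullspace
v2 = [-4, 0, -2, 1, 0, 0]
v3 = [-2, 0, 0, 0, 1, 0]

def matvec(M, x):
    """Matrix-vector product using exact arithmetic."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in M]

def general_solution(r, s, t):
    """x0 + r*v1 + s*v2 + t*v3, as in Formula 3."""
    return [p + r * a + s * c + t * d for p, a, c, d in zip(x0, v1, v2, v3)]

print(matvec(A, x0) == b)
print(all(matvec(A, v) == [0, 0, 0, 0] for v in (v1, v2, v3)))
print(all(matvec(A, general_solution(r, s, t)) == b
          for r, s, t in [(0, 0, 0), (1, -2, 5), (Fraction(1, 2), 3, -4)]))
```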



Bases for Row Spaces, Column Spaces, and Nullspaces 

We first developed elementary row operations for the purpose of solving linear systems, and we know from that work that 
performing an elementary row operation on an augmented matrix does not change the solution set of the corresponding linear 
system. It follows that applying an elementary row operation to a matrix A does not change the solution set of the corresponding 
linear system Ax = 0, or, stated another way, it does not change the nullspace of A. Thus we have the following theorem. 



THEOREM 5.5.3 



Elementary row operations do not change the nullspace of a matrix. 



EXAMPLE 4 Basis for Nullspace 



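Find a basis for the nullspace of

\[
A = \begin{bmatrix}
2 & 2 & -1 & 0 & 1 \\
-1 & -1 & 2 & -3 & 1 \\
1 & 1 & -2 & 0 & -1 \\
0 & 0 & 1 & 1 & 1
\end{bmatrix}
\]

Solution

The nullspace of A is the solution space of the homogeneous system

\[
\begin{aligned}
2x_1 + 2x_2 - x_3 \phantom{{}- 3x_4} + x_5 &= 0 \\
-x_1 - x_2 + 2x_3 - 3x_4 + x_5 &= 0 \\
x_1 + x_2 - 2x_3 \phantom{{}- 3x_4} - x_5 &= 0 \\
x_3 + x_4 + x_5 &= 0
\end{aligned}
\]

In Example 10 of Section 5.4 we showed that the vectors

\[
v_1 = \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
\quad\text{and}\quad
v_2 = \begin{bmatrix} -1 \\ 0 \\ -1 \\ 0 \\ 1 \end{bmatrix}
\]

form a basis for this space.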
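A nullspace basis like the one in Example 4 can be read off from the reduced row-echelon form: each free variable contributes one basis vector. The Python sketch below is an illustration under that standard approach, not the text's own procedure; `rref` and `nullspace_basis` are hypothetical helper names, and the matrix is the one reconstructed for Example 4.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced row-echelon form and pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def nullspace_basis(A):
    """One basis vector per free column: set that variable to 1, the other free ones to 0."""
    R, pivots = rref(A)
    n = len(A[0])
    basis = []
    for f in (j for j in range(n) if j not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for i, p in enumerate(pivots):
            v[p] = -R[i][f]          # solve for the leading variable in row i
        basis.append(v)
    return basis

A = [
    [2, 2, -1, 0, 1],
    [-1, -1, 2, -3, 1],
    [1, 1, -2, 0, -1],
    [0, 0, 1, 1, 1],
]
basis = nullspace_basis(A)
for v in basis:
    print([int(x) for x in v])
```

Because row operations do not change the nullspace (Theorem 5.5.3), solving the reduced system is enough; for this matrix the two vectors produced agree with $v_1$ and $v_2$ above.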



The following theorem is a companion to Theorem 5.5.3. 



THEOREM 5.5.4 



Elementary row operations do not change the row space of a matrix. 



Proof Suppose that the row vectors of a matrix A are $r_1, r_2, \ldots, r_m$, and let B be obtained from A by performing an elementary row operation. We shall show that every vector in the row space of B is also in the row space of A and that, conversely, every vector in the row space of A is in the row space of B. We can then conclude that A and B have the same row space.

Consider the possibilities: If the row operation is a row interchange, then B and A have the same row vectors and consequently have the same row space. If the row operation is multiplication of a row by a nonzero scalar or the addition of a multiple of one row to another, then the row vectors $r_1', r_2', \ldots, r_m'$ of B are linear combinations of $r_1, r_2, \ldots, r_m$; thus they lie in the row space of A. Since a vector space is closed under addition and scalar multiplication, all linear combinations of $r_1', r_2', \ldots, r_m'$ will also lie in the row space of A. Therefore, each vector in the row space of B is in the row space of A.

Since B is obtained from A by performing a row operation, A can be obtained from B by performing the inverse operation (Section 1.5). Thus the argument above shows that the row space of A is contained in the row space of B.

■ 

In light of Theorems 5.5.3 and 5.5.4, one might anticipate that elementary row operations should not change the column space of a matrix. However, this is not so: elementary row operations can change the column space. For example, consider the matrix

\[
A = \begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix}
\]

The second column is a scalar multiple of the first, so the column space of A consists of all scalar multiples of the first column vector. However, if we add -2 times the first row of A to the second row, we obtain

\[
B = \begin{bmatrix} 1 & 3 \\ 0 & 0 \end{bmatrix}
\]

Here again the second column is a scalar multiple of the first, so the column space of B consists of all scalar multiples of the first column vector. This is not the same as the column space of A.
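The contrast between row space and column space in this small example can be checked directly. The following Python sketch is illustrative only; `is_multiple` is a hypothetical helper that tests whether one 2-vector lies on the line spanned by another.

```python
A = [[1, 3], [2, 6]]
# Add -2 times the first row to the second row.
B = [A[0][:], [a2 - 2 * a1 for a1, a2 in zip(A[0], A[1])]]

def is_multiple(v, w):
    """True if the 2-vector v is a scalar multiple of the nonzero 2-vector w."""
    return v[0] * w[1] - v[1] * w[0] == 0

first_col_A = [A[0][0], A[1][0]]   # spans the column space of A
first_col_B = [B[0][0], B[1][0]]   # spans the column space of B

print(B)
# The column spaces differ: the first column of A is not in the column space of B.
print(is_multiple(first_col_A, first_col_B))
# The row spaces agree: every row of B lies on the line spanned by the first row of A.
print(all(is_multiple(row, A[0]) for row in B))
```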

Although elementary row operations can change the column space of a matrix, we shall show that whatever relationships of linear independence or linear dependence exist among the column vectors prior to a row operation will also hold for the corresponding columns of the matrix that results from that operation. To make this more precise, suppose a matrix B results from performing an elementary row operation on an m x n matrix A. By Theorem 5.5.3, the two homogeneous linear systems

\[
Ax = 0 \quad\text{and}\quad Bx = 0
\]

have the same solution set. Thus the first system has a nontrivial solution if and only if the same is true of the second. But if the column vectors of A and B, respectively, are

\[
c_1, c_2, \ldots, c_n \quad\text{and}\quad c_1', c_2', \ldots, c_n'
\]

then from 2 the two systems can be rewritten as

\[
x_1 c_1 + x_2 c_2 + \cdots + x_n c_n = 0 \tag{6}
\]

and

\[
x_1 c_1' + x_2 c_2' + \cdots + x_n c_n' = 0 \tag{7}
\]

Thus 6 has a nontrivial solution for $x_1, x_2, \ldots, x_n$ if and only if the same is true of 7. This implies that the column vectors of A are linearly independent if and only if the same is true of B. Although we shall omit the proof, this conclusion also applies to any subset of the column vectors. Thus we have the following result.



THEOREM 5.5.5 



If A and B are row equivalent matrices, then 



(a) A given set of column vectors of A is linearly independent if and only if the corresponding column vectors of B are linearly independent.

(b) A given set of column vectors of A forms a basis for the column space of A if and only if the corresponding column vectors of B form a basis for the column space of B.



The following theorem makes it possible to find bases for the row and column spaces of a matrix in row-echelon form by 
inspection. 

THEOREM 5.5.6 



If a matrix R is in row-echelon form, then the row vectors with the leading 1's (the nonzero row vectors) form a basis for the row space of R, and the column vectors with the leading 1's of the row vectors form a basis for the column space of R.

Since this result is virtually self-evident when one looks at numerical examples, we shall omit the proof; the proof involves little more than an analysis of the positions of the 0's and 1's of R.



EXAMPLE 5 Bases for Row and Column Spaces 



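The matrix

\[
R = \begin{bmatrix}
1 & -2 & 5 & 0 & 3 \\
0 & 1 & 3 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]

is in row-echelon form. From Theorem 5.5.6, the vectors

\[
\begin{aligned}
r_1 &= \begin{bmatrix} 1 & -2 & 5 & 0 & 3 \end{bmatrix} \\
r_2 &= \begin{bmatrix} 0 & 1 & 3 & 0 & 0 \end{bmatrix} \\
r_3 &= \begin{bmatrix} 0 & 0 & 0 & 1 & 0 \end{bmatrix}
\end{aligned}
\]

form a basis for the row space of R, and the vectors

\[
c_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad
c_2 = \begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad
c_4 = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}
\]

form a basis for the column space of R.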



EXAMPLE 6 Bases for Row and Column Spaces 



Find bases for the row and column spaces of

\[
A = \begin{bmatrix}
1 & -3 & 4 & -2 & 5 & 4 \\
2 & -6 & 9 & -1 & 8 & 2 \\
2 & -6 & 9 & -1 & 9 & 7 \\
-1 & 3 & -4 & 2 & -5 & -4
\end{bmatrix}
\]

Solution 



Since elementary row operations do not change the row space of a matrix, we can find a basis for the row space of A by finding a 
basis for the row space of any row-echelon form of A. Reducing A to row-echelon form, we obtain (verify) 



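\[
R = \begin{bmatrix}
1 & -3 & 4 & -2 & 5 & 4 \\
0 & 0 & 1 & 3 & -2 & -6 \\
0 & 0 & 0 & 0 & 1 & 5 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]

By Theorem 5.5.6, the nonzero row vectors of R form a basis for the row space of R and hence form a basis for the row space of A. These basis vectors are

\[
\begin{aligned}
r_1 &= \begin{bmatrix} 1 & -3 & 4 & -2 & 5 & 4 \end{bmatrix} \\
r_2 &= \begin{bmatrix} 0 & 0 & 1 & 3 & -2 & -6 \end{bmatrix} \\
r_3 &= \begin{bmatrix} 0 & 0 & 0 & 0 & 1 & 5 \end{bmatrix}
\end{aligned}
\]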



Keeping in mind that A and R may have different column spaces, we cannot find a basis for the column space of A directly from 
the column vectors of R. However, it follows from Theorem 5.5.5b that if we can find a set of column vectors of R that forms a 
basis for the column space of R, then the corresponding column vectors of A will form a basis for the column space of A. 

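The first, third, and fifth columns of R contain the leading 1's of the row vectors, so

\[
c_1' = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad
c_3' = \begin{bmatrix} 4 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad
c_5' = \begin{bmatrix} 5 \\ -2 \\ 1 \\ 0 \end{bmatrix}
\]

form a basis for the column space of R; thus the corresponding column vectors of A, namely,

\[
c_1 = \begin{bmatrix} 1 \\ 2 \\ 2 \\ -1 \end{bmatrix}, \quad
c_3 = \begin{bmatrix} 4 \\ 9 \\ 9 \\ -4 \end{bmatrix}, \quad
c_5 = \begin{bmatrix} 5 \\ 8 \\ 9 \\ -5 \end{bmatrix}
\]

form a basis for the column space of A.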
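The method of Example 6 can be sketched in Python as follows. This is an illustration only: `rref` is a hypothetical helper, and it reduces all the way to reduced row-echelon form rather than stopping at the row-echelon form R used in the text. The leading 1's fall in the same columns either way, so the column-space basis agrees with the one above; the row-space basis it produces is a different, equally valid, basis.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced row-echelon form and pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

A = [
    [1, -3, 4, -2, 5, 4],
    [2, -6, 9, -1, 8, 2],
    [2, -6, 9, -1, 9, 7],
    [-1, 3, -4, 2, -5, -4],
]
R, pivots = rref(A)

# Nonzero rows of the reduced form span the same row space as A (Theorem 5.5.4).
row_basis = [row for row in R if any(x != 0 for x in row)]
# Columns of A itself (not of R) at the pivot positions (Theorem 5.5.5b).
col_basis = [[row[j] for row in A] for j in pivots]

print(pivots)       # zero-based positions of the leading 1's
print(col_basis)
```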



EXAMPLE 7 Basis for a Vector Space Using Row Operations 



Find a basis for the space spanned by the vectors

\[
v_1 = (1, -2, 0, 0, 3), \quad v_2 = (2, -5, -3, -2, 6), \quad v_3 = (0, 5, 15, 10, 0), \quad v_4 = (2, 6, 18, 8, 6)
\]

Solution

Except for a variation in notation, the space spanned by these vectors is the row space of the matrix

\[
\begin{bmatrix}
1 & -2 & 0 & 0 & 3 \\
2 & -5 & -3 & -2 & 6 \\
0 & 5 & 15 & 10 & 0 \\
2 & 6 & 18 & 8 & 6
\end{bmatrix}
\]

Reducing this matrix to row-echelon form, we obtain

\[
\begin{bmatrix}
1 & -2 & 0 & 0 & 3 \\
0 & 1 & 3 & 2 & 0 \\
0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]

The nonzero row vectors in this matrix are

\[
w_1 = (1, -2, 0, 0, 3), \quad w_2 = (0, 1, 3, 2, 0), \quad w_3 = (0, 0, 1, 1, 0)
\]

These vectors form a basis for the row space and consequently form a basis for the subspace of $R^5$ spanned by $v_1$, $v_2$, $v_3$, and $v_4$.

Observe that in Example 6 the basis vectors obtained for the column space of A consisted of column vectors of A, but the basis 
vectors obtained for the row space of A were not all row vectors of A. The following example illustrates a procedure for finding a 
basis for the row space of a matrix A that consists entirely of row vectors of A. 



EXAMPLE 8 Basis for the Row Space of a Matrix 



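Find a basis for the row space of

\[
A = \begin{bmatrix}
1 & -2 & 0 & 0 & 3 \\
2 & -5 & -3 & -2 & 6 \\
0 & 5 & 15 & 10 & 0 \\
2 & 6 & 18 & 8 & 6
\end{bmatrix}
\]

consisting entirely of row vectors from A.

Solution

We will transpose A, thereby converting the row space of A into the column space of $A^T$; then we will use the method of Example 6 to find a basis for the column space of $A^T$; and then we will transpose again to convert column vectors back to row vectors. Transposing A yields

\[
A^T = \begin{bmatrix}
1 & 2 & 0 & 2 \\
-2 & -5 & 5 & 6 \\
0 & -3 & 15 & 18 \\
0 & -2 & 10 & 8 \\
3 & 6 & 0 & 6
\end{bmatrix}
\]

Reducing this matrix to row-echelon form yields

\[
\begin{bmatrix}
1 & 2 & 0 & 2 \\
0 & 1 & -5 & -10 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{bmatrix}
\]

The first, second, and fourth columns contain the leading 1's, so the corresponding column vectors in $A^T$ form a basis for the column space of $A^T$; these are

\[
c_1 = \begin{bmatrix} 1 \\ -2 \\ 0 \\ 0 \\ 3 \end{bmatrix}, \quad
c_2 = \begin{bmatrix} 2 \\ -5 \\ -3 \\ -2 \\ 6 \end{bmatrix}, \quad\text{and}\quad
c_4 = \begin{bmatrix} 2 \\ 6 \\ 18 \\ 8 \\ 6 \end{bmatrix}
\]

Transposing again and adjusting the notation appropriately yields the basis vectors

\[
r_1 = \begin{bmatrix} 1 & -2 & 0 & 0 & 3 \end{bmatrix}, \quad
r_2 = \begin{bmatrix} 2 & -5 & -3 & -2 & 6 \end{bmatrix}, \quad\text{and}\quad
r_4 = \begin{bmatrix} 2 & 6 & 18 & 8 & 6 \end{bmatrix}
\]

for the row space of A.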



We know from Theorem 5.5.5 that elementary row operations do not alter relationships of linear independence and linear dependence among the column vectors; however, Formulas 6 and 7 imply an even deeper result. Because these formulas actually have the same scalar coefficients $x_1, x_2, \ldots, x_n$, it follows that elementary row operations do not alter the formulas (linear combinations) that relate linearly dependent column vectors. We omit the formal proof.



EXAMPLE 9 Basis and Linear Combinations 



(a) Find a subset of the vectors

\[
v_1 = (1, -2, 0, 3), \quad v_2 = (2, -5, -3, 6), \quad v_3 = (0, 1, 3, 0), \quad v_4 = (2, -1, 4, -7), \quad v_5 = (5, -8, 1, 2)
\]

that forms a basis for the space spanned by these vectors.

(b) Express each vector not in the basis as a linear combination of the basis vectors.

Solution (a)

We begin by constructing a matrix that has $v_1, v_2, \ldots, v_5$ as its column vectors:

\[
\begin{bmatrix}
1 & 2 & 0 & 2 & 5 \\
-2 & -5 & 1 & -1 & -8 \\
0 & -3 & 3 & 4 & 1 \\
3 & 6 & 0 & -7 & 2
\end{bmatrix}
\tag{8}
\]

The first part of our problem can be solved by finding a basis for the column space of this matrix. Reducing the matrix to reduced row-echelon form and denoting the column vectors of the resulting matrix by $w_1$, $w_2$, $w_3$, $w_4$, and $w_5$ yields

\[
\begin{bmatrix}
1 & 0 & 2 & 0 & 1 \\
0 & 1 & -1 & 0 & 1 \\
0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\tag{9}
\]

The leading 1's occur in columns 1, 2, and 4, so by Theorem 5.5.6,

\[
\{w_1, w_2, w_4\}
\]

is a basis for the column space of 9, and consequently,

\[
\{v_1, v_2, v_4\}
\]

is a basis for the column space of 8.

Solution (b)

We shall start by expressing $w_3$ and $w_5$ as linear combinations of the basis vectors $w_1$, $w_2$, $w_4$. The simplest way of doing this is to express $w_3$ and $w_5$ in terms of basis vectors with smaller subscripts. Thus we shall express $w_3$ as a linear combination of $w_1$ and $w_2$, and we shall express $w_5$ as a linear combination of $w_1$, $w_2$, and $w_4$. By inspection of 9, these linear combinations are

\[
\begin{aligned}
w_3 &= 2w_1 - w_2 \\
w_5 &= w_1 + w_2 + w_4
\end{aligned}
\]

We call these the dependency equations. The corresponding relationships in 8 are

\[
\begin{aligned}
v_3 &= 2v_1 - v_2 \\
v_5 &= v_1 + v_2 + v_4
\end{aligned}
\]

The procedure illustrated in the preceding example is sufficiently important that we shall summarize the steps: 

Given a set of vectors $S = \{v_1, v_2, \ldots, v_k\}$ in $R^n$, the following procedure produces a subset of these vectors that forms a basis for span(S) and expresses those vectors of S that are not in the basis as linear combinations of the basis vectors.

Step 1. Form the matrix A having $v_1, v_2, \ldots, v_k$ as its column vectors.

Step 2. Reduce the matrix A to its reduced row-echelon form R, and let $w_1, w_2, \ldots, w_k$ be the column vectors of R.

Step 3. Identify the columns that contain the leading 1's in R. The corresponding column vectors of A are the basis vectors for span(S).

Step 4. Express each column vector of R that does not contain a leading 1 as a linear combination of preceding column vectors that do contain leading 1's. (You will be able to do this by inspection.) This yields a set of dependency equations involving the column vectors of R. The corresponding equations for the column vectors of A express the vectors that are not in the basis as linear combinations of the basis vectors.
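Steps 1 through 4 can be sketched in Python as follows. This is an illustration only; `rref` and `dependency_equations` are hypothetical helper names, and the data are the five vectors of Example 9. In reduced row-echelon form, the entries of a column without a leading 1 are exactly the coefficients of the preceding pivot columns, which is why Step 4 can be done "by inspection".

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced row-echelon form and pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def dependency_equations(vectors):
    """Return (pivots, deps); deps[j] maps basis index -> coefficient for each non-basis vector j."""
    rows = [list(r) for r in zip(*vectors)]      # Step 1: the vectors become columns
    R, pivots = rref(rows)                       # Step 2
    deps = {}
    for j in range(len(vectors)):                # Steps 3 and 4
        if j not in pivots:
            deps[j] = {p: R[i][j] for i, p in enumerate(pivots) if R[i][j] != 0}
    return pivots, deps

v = [(1, -2, 0, 3), (2, -5, -3, 6), (0, 1, 3, 0), (2, -1, 4, -7), (5, -8, 1, 2)]
pivots, deps = dependency_equations(v)
print(pivots)   # zero-based indices of the basis vectors
for j, coeffs in deps.items():
    terms = " + ".join(f"({c})*v{p + 1}" for p, c in coeffs.items())
    print(f"v{j + 1} = {terms}")
```

Indices in the code are zero-based, so index 0 corresponds to $v_1$ in the text.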



Exercise Set 5.5 






List the row vectors and column vectors of the matrix 



2 


-1 


1 


3 


5 7 


-1 


1 


4 2 


7 



Express the product Ax as a linear combination of the column vectors of A. 



(a) 



2 3 
■1 4 



(b) 




6 
1 



(c) 



-3 
5 
2 
1 



6 
-4 

3 

8 



(d) 



2 1 
6 3 



3 



-5 



Determine whether b is in the column space of A, and if so, express b as a linear combination of the column vectors of A.



(a) 



A = 



1 3 
4 -6 



;b = 



-2 
10 



(b) 



.4 = 



"1 1 2 




"-1" 


1 1 


;b = 





2 1 3 




2 



(c) 



A = 



1 f 




5" 


3 1 


;b = 


1 


1 1 




-1 



(d) 



A = 



1 


-1 


f 




"2" 


1 


1 


-1 


;b = 





1 


-1 


1 








(e) 



A = 



12 1" 




'a' 


12 1 
12 13 


;b = 


3 
5 


12 2 




7 



4. Suppose that $x_1 = -1$, $x_2 = 2$, $x_3 = 4$, $x_4 = -3$ is a solution of a nonhomogeneous linear system Ax = b and that the solution set of the homogeneous system Ax = 0 is given by the formulas

\[
x_1 = -3r + 4s, \quad x_2 = r - s, \quad x_3 = r, \quad x_4 = s
\]

(a) Find the vector form of the general solution of Ax = 0.

(b) Find the vector form of the general solution of Ax = b.



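5. Find the vector form of the general solution of the given linear system Ax = b; then use that result to find the vector form of the general solution of Ax = 0.

(a)
\[
\begin{aligned}
x_1 - 3x_2 &= 1 \\
2x_1 - 6x_2 &= 2
\end{aligned}
\]

(b)
\[
\begin{aligned}
x_1 + x_2 + 2x_3 &= 5 \\
x_1 \phantom{{}+ x_2} + x_3 &= -2 \\
2x_1 + x_2 + 3x_3 &= 3
\end{aligned}
\]

(c)
\[
\begin{aligned}
x_1 - 2x_2 + x_3 + 2x_4 &= -1 \\
2x_1 - 4x_2 + 2x_3 + 4x_4 &= -2 \\
-x_1 + 2x_2 - x_3 - 2x_4 &= 1 \\
3x_1 - 6x_2 + 3x_3 + 6x_4 &= -3
\end{aligned}
\]

(d)
\[
\begin{aligned}
x_1 + 2x_2 - 3x_3 + x_4 &= 4 \\
-2x_1 + x_2 + 2x_3 + x_4 &= -1 \\
-x_1 + 3x_2 - x_3 + 2x_4 &= 3 \\
4x_1 - 7x_2 \phantom{{}- x_3} - 5x_4 &= -5
\end{aligned}
\]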


Find a basis for the nullspace of A. 



(a) 



A = 



1-1 3 
5 -4 -4 
7-6 2 



(b) 



A = 



(c) 



A = 



(d) 



A = 



(e) 



A = 



2 


-1 










4 


-2 




















1 


4 5 


2" 








2 


1 3 











-1 


3 2 


2 








1 


4 


5 


6 


9 


3 


-2 


1 


4 


-1 


-1 





-1 


-2 


-1 


2 


3 


5 


7 


8 


1 


-3 


2 


2 


1 





3 


6 





-3 


2 


-3 


-2 


4 


4 


3 


-6 





6 


5 


-2 


9 




2 


-4 


-5 



7. 



In each part, a matrix in row-echelon form is given. By inspection, find bases for the row and column spaces of A. 



(a) 



(b) 



(c) 



(d) 



1 


2 











1 
















1 - 


3 


o" 







1 





























1 2 


4 


5 


1 


-3 








1 


-3 








1 











1 2 


-1 


5 


1 


4 


3 





1 


-7 












1 



For the matrices in Exercise 6, find a basis for the row space of A by reducing the matrix to row-echelon form. 



9. 



For the matrices in Exercise 6, find a basis for the column space of A. 



10. 



For the matrices in Exercise 6, find a basis for the row space of A consisting entirely of row vectors of A. 



11. 



Find a basis for the subspace of $R^4$ spanned by the given vectors.



(a) (1, 1, -4, -3), (2, 0, 2, -2), (2, -1, 3, 2) 



(b) (-1,1, -2,0), (3, 3, 6,0), (9, 0,0, 3) 



(c) (1, 1, 0, 0), (0, 0, 1, 1), (-2, 0, 2, 2), (0, -3, 0, 3) 



Find a subset of the vectors that forms a basis for the space spanned by the vectors; then express each vector that is not in the 
12. basis as a linear combination of the basis vectors. 



(a) $v_1 = (1, 0, 1, 1)$, $v_2 = (-3, 3, 1, 1)$, $v_3 = (-1, 3, 9, 3)$, $v_4 = (-5, 3, 5, -1)$

(b) $v_1 = (1, -2, 0, 3)$, $v_2 = (2, -4, 0, 6)$, $v_3 = (-1, 1, 2, 0)$, $v_4 = (0, -1, 2, 3)$

(c) $v_1 = (1, -1, 5, 2)$, $v_2 = (-2, 3, 1, 0)$, $v_3 = (4, -5, 9, 4)$, $v_4 = (0, 4, 2, -3)$, $v_5 = (-7, 18, 2, -8)$



13. 



Prove that the row vectors of an n x n invertible matrix A form a basis for $R^n$.



14. 



(a) Let 



"0 


1 


0" 


1 


















,4 = 



and consider a rectangular xyz-coordinate system in 3-space. Show that the nullspace of A consists of all points on the z-axis and that the column space consists of all points in the xy-plane (see the accompanying figure).



(b) Find a 3 x 3 matrix whose nullspace is the x-axis and whose column space is the yz-plane. 



Figure Ex-14



15. 



Find a 3 x 3 matrix whose nullspace is 



(a) a point 



(b) a line



(c) a plane 



Discussion Discovery

16. Indicate whether each statement is always true or sometimes false. Justify your answer by giving a logical argument or a counterexample.



(a) If E is an elementary matrix, then A and EA must have the same nullspace. 



(b) If E is an elementary matrix, then A and EA must have the same row space. 



(c) If E is an elementary matrix, then A and EA must have the same column space. 



(d) If Ax = b does not have any solutions, then b is not in the column space of A.



(e) The row space and nullspace of an invertible matrix are the same. 



17. 



(a) Find all 2 x 2 matrices whose nullspace is the line $3x - 5y = 0$.



(b) Sketch the nullspaces of the following matrices: 



,4 = 



1 4 
5 


B = 


1 
5 



c= 



6 2 
3 1 



D = 








18. The equation $x_1 + x_2 + x_3 = 1$ can be viewed as a linear system of one equation in three unknowns. Express its general solution as a particular solution plus the general solution of the corresponding homogeneous system. [Write the vectors in column form.]

19. Suppose that A and B are n x n matrices and A is invertible. Invent and prove a theorem that describes how the row spaces of AB and B are related.
5.6 RANK AND NULLITY

In the preceding section we investigated the relationships between systems of linear equations and the row space, column space, and nullspace of the coefficient matrix. In this section we shall be concerned with relationships between the dimensions of the row space, column space, and nullspace of a matrix and its transpose. The results we will obtain are fundamental and will provide deeper insights into linear systems and linear transformations.



Four Fundamental Matrix Spaces 

If we consider a matrix A and its transpose $A^T$ together, then there are six vector spaces of interest:

row space of A          row space of $A^T$

column space of A       column space of $A^T$

nullspace of A          nullspace of $A^T$

However, transposing a matrix converts row vectors into column vectors and column vectors into row vectors, so except for a difference in notation, the row space of $A^T$ is the same as the column space of A, and the column space of $A^T$ is the same as the row space of A. This leaves four vector spaces of interest:

row space of A          column space of A

nullspace of A          nullspace of $A^T$

These are known as the fundamental matrix spaces associated with A. If A is an m x n matrix, then the row space of A and the nullspace of A are subspaces of $R^n$, and the column space of A and the nullspace of $A^T$ are subspaces of $R^m$. Our primary goal in this section is to establish relationships between the dimensions of these four vector spaces.

Row and Column Spaces Have Equal Dimensions 

In Example 6 of Section 5.5, we found that the row and column spaces of the matrix

\[
A = \begin{bmatrix}
1 & -3 & 4 & -2 & 5 & 4 \\
2 & -6 & 9 & -1 & 8 & 2 \\
2 & -6 & 9 & -1 & 9 & 7 \\
-1 & 3 & -4 & 2 & -5 & -4
\end{bmatrix}
\]



each have three basis vectors; that is, both are three-dimensional. It is not accidental that these dimensions are the same; it is a 
consequence of the following general result. 

THEOREM 5.6.1 



If A is any matrix, then the row space and column space of A have the same dimension. 



Proof Let R be any row-echelon form of A. It follows from Theorem 5.5.4 that

dim(row space of A) = dim(row space of R)

and it follows from Theorem 5.5.5b that

dim(column space of A) = dim(column space of R)

Thus the proof will be complete if we can show that the row space and column space of R have the same dimension. But the dimension of the row space of R is the number of nonzero rows, and the dimension of the column space of R is the number of columns that contain leading 1's (Theorem 5.5.6). However, the nonzero rows are precisely the rows in which the leading 1's occur, so the number of leading 1's and the number of nonzero rows are the same. This shows that the row space and column space of R have the same dimension.

■ 

The dimensions of the row space, column space, and nullspace of a matrix are such important numbers that there is some 
notation and terminology associated with them. 



DEFINITION 



The common dimension of the row space and column space of a matrix A is called the rank of A and is denoted by rank(A); 
the dimension of the nullspace of A is called the nullity of A and is denoted by nullity (A). 



EXAMPLE 1 Rank and Nullity of a 4 x 6 Matrix



Find the rank and nullity of the matrix

\[
A = \begin{bmatrix}
-1 & 2 & 0 & 4 & 5 & -3 \\
3 & -7 & 2 & 0 & 1 & 4 \\
2 & -5 & 2 & 4 & 6 & 1 \\
4 & -9 & 2 & -4 & -4 & 7
\end{bmatrix}
\]



Solution 



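The reduced row-echelon form of A is

\[
\begin{bmatrix}
1 & 0 & -4 & -28 & -37 & 13 \\
0 & 1 & -2 & -12 & -16 & 5 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\tag{1}
\]

(verify). Since there are two nonzero rows (or, equivalently, two leading 1's), the row space and column space are both two-dimensional, so rank(A) = 2. To find the nullity of A, we must find the dimension of the solution space of the linear system Ax = 0. This system can be solved by reducing the augmented matrix to reduced row-echelon form. The resulting matrix will be identical to 1, except that it will have an additional last column of zeros, and the corresponding system of equations will be

\[
\begin{aligned}
x_1 - 4x_3 - 28x_4 - 37x_5 + 13x_6 &= 0 \\
x_2 - 2x_3 - 12x_4 - 16x_5 + 5x_6 &= 0
\end{aligned}
\]

or, on solving for the leading variables,

\[
\begin{aligned}
x_1 &= 4x_3 + 28x_4 + 37x_5 - 13x_6 \\
x_2 &= 2x_3 + 12x_4 + 16x_5 - 5x_6
\end{aligned}
\tag{2}
\]

It follows that the general solution of the system is

\[
\begin{aligned}
x_1 &= 4r + 28s + 37t - 13u \\
x_2 &= 2r + 12s + 16t - 5u \\
x_3 &= r \\
x_4 &= s \\
x_5 &= t \\
x_6 &= u
\end{aligned}
\]

or, equivalently,

\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{bmatrix}
= r \begin{bmatrix} 4 \\ 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ s \begin{bmatrix} 28 \\ 12 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}
+ t \begin{bmatrix} 37 \\ 16 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}
+ u \begin{bmatrix} -13 \\ -5 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}
\tag{3}
\]

Because the four vectors on the right side of 3 form a basis for the solution space, nullity(A) = 4.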



The following theorem states that a matrix and its transpose have the same rank. 



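THEOREM 5.6.2 

If A is any matrix, then $\operatorname{rank}(A) = \operatorname{rank}(A^T)$. 

Proof 

\[
\operatorname{rank}(A) = \dim(\text{row space of } A) = \dim(\text{column space of } A^T) = \operatorname{rank}(A^T)
\]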



The following theorem establishes an important relationship between the rank and nullity of a matrix. 







THEOREM 5.6.3 



Dimension Theorem for Matrices 

If A is a matrix with n columns, then 



\[
\operatorname{rank}(A) + \operatorname{nullity}(A) = n
\tag{4}
\]



Proof Since A has n columns, the homogeneous linear system $A\mathbf{x} = \mathbf{0}$ has n unknowns (variables). These fall into two 
categories: the leading variables and the free variables. Thus 

\[
\begin{bmatrix} \text{number of leading} \\ \text{variables} \end{bmatrix}
+ \begin{bmatrix} \text{number of free} \\ \text{variables} \end{bmatrix} = n
\]

But the number of leading variables is the same as the number of leading 1's in the reduced 
row-echelon form of A, and this is the rank of A. Thus 

\[
\operatorname{rank}(A) + \begin{bmatrix} \text{number of free} \\ \text{variables} \end{bmatrix} = n
\]

The number of free variables is equal to the nullity of A. This is so because the nullity of A is the 
dimension of the solution space of $A\mathbf{x} = \mathbf{0}$, which is the same as the number of parameters in the 
general solution [see (3), for example], which is the same as the number of free variables. Thus 

\[
\operatorname{rank}(A) + \operatorname{nullity}(A) = n
\]



The proof of the preceding theorem contains two results that are of importance in their own right. 



THEOREM 5.6.4 



If A is an m × n matrix, then 

(a) rank(A) = the number of leading variables in the solution of $A\mathbf{x} = \mathbf{0}$. 

(b) nullity(A) = the number of parameters in the general solution of $A\mathbf{x} = \mathbf{0}$. 







EXAMPLE 2 The Sum of Rank and Nullity 



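The matrix 

\[
A = \begin{bmatrix}
-1 & 2 & 0 & 4 & 5 & -3 \\
3 & -7 & 2 & 0 & 1 & 4 \\
2 & -5 & 2 & 4 & 6 & 1 \\
4 & -9 & 2 & -4 & -4 & 7
\end{bmatrix}
\]

has 6 columns, so 

\[
\operatorname{rank}(A) + \operatorname{nullity}(A) = 6
\]

This is consistent with Example 1, where we showed that 

\[
\operatorname{rank}(A) = 2 \quad \text{and} \quad \operatorname{nullity}(A) = 4
\]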



EXAMPLE 3 Number of Parameters in a General Solution 



Find the number of parameters in the general solution of $A\mathbf{x} = \mathbf{0}$ if A is a 5 × 7 matrix of rank 3. 



Solution 

From (4), 

\[
\operatorname{nullity}(A) = n - \operatorname{rank}(A) = 7 - 3 = 4
\]

Thus there are four parameters. 

Suppose now that A is an m × n matrix of rank r; it follows from Theorem 5.6.2 that $A^T$ is an n × m matrix of rank r. Applying 
Theorem 5.6.3 to A and $A^T$ yields 

\[
\operatorname{nullity}(A) = n - r, \qquad \operatorname{nullity}(A^T) = m - r
\]

from which we deduce the following table relating the dimensions of the four fundamental spaces of an m × n matrix A of rank r. 



Fundamental Space        Dimension 

Row space of A           r 
Column space of A        r 
Nullspace of A           n - r 
Nullspace of A^T         m - r 
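The table of dimensions can be turned into a small lookup. The sketch below is illustrative only; the function name `fundamental_dimensions` and the dictionary keys are our own, not notation from the text. It takes the size m × n and the rank r of a matrix and returns the four dimensions, rejecting a rank that is negative or larger than min(m, n).

```python
def fundamental_dimensions(m, n, r):
    """Dimensions of the four fundamental spaces of an m x n matrix of rank r."""
    if not 0 <= r <= min(m, n):
        raise ValueError("the rank must satisfy 0 <= r <= min(m, n)")
    return {
        "row space of A": r,
        "column space of A": r,
        "nullspace of A": n - r,      # Theorem 5.6.3 applied to A
        "nullspace of A^T": m - r,    # Theorem 5.6.3 applied to the transpose
    }

# The 4 x 6 matrix of Example 1 has rank 2.
print(fundamental_dimensions(4, 6, 2))
```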



Applications of Rank 

The advent of the Internet has stimulated research on finding efficient methods for transmitting large amounts of digital 
data over communications lines with limited bandwidth. Digital data is commonly stored in matrix form, and many 
techniques for improving transmission speed use the rank of a matrix in some way. Rank plays a role because it measures 
the "redundancy" in a matrix in the sense that if A is an m × n matrix of rank k, then n - k of the column vectors and m - k 
of the row vectors can be expressed in terms of k linearly independent column or row vectors. The essential idea in many 
data compression schemes is to approximate the original data set by a data set with smaller rank that conveys nearly the 
same information, then eliminate redundant vectors in the approximating set to speed up the transmission time. 
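To make the redundancy idea concrete, here is a minimal sketch in Python. It is our own illustration rather than a method from the text, and the vectors and the helper name `outer` are hypothetical. A rank 1 matrix is built from one column vector and one row vector, so every column is a multiple of the same column vector, and all m·n entries can be regenerated from only m + n stored numbers.

```python
def outer(col, row):
    """Build the rank 1 matrix whose (i, j) entry is col[i] * row[j]."""
    return [[c * r for r in row] for c in col]

col = [1, 2, 3]          # hypothetical column vector (m = 3)
row = [4, 0, -1, 2]      # hypothetical row vector (n = 4)

A = outer(col, row)      # a 3 x 4 matrix of rank 1
stored_full = len(col) * len(row)      # entries needed to send A itself
stored_factored = len(col) + len(row)  # entries needed to send col and row

for line in A:
    print(line)
print(stored_full, stored_factored)
```

Practical compression schemes first replace the data by a nearby matrix of low rank; that approximation step is not shown here.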



Maximum Value for Rank 

If A is an m × n matrix, then the row vectors lie in $R^n$ and the column vectors lie in $R^m$. This implies that the row space of A is 
at most n-dimensional and that the column space is at most m-dimensional. Since the row and column spaces have the same 
dimension (the rank of A), we must conclude that if m ≠ n, then the rank of A is at most the smaller of the values of m and n. 
We denote this by writing 

\[
\operatorname{rank}(A) \le \min(m, n)
\tag{5}
\]

where min(m, n) denotes the smaller of the numbers m and n if m ≠ n or denotes their common value if m = n. 



EXAMPLE 4 Maximum Value of Rank for a 7 × 4 Matrix 



If A is a 7 x 4 matrix, then the rank of A is at most 4, and consequently, the seven row vectors must be linearly dependent. If A 
is a 4 x 7 matrix, then again the rank of A is at most 4, and consequently, the seven column vectors must be linearly dependent. 



Linear Systems of m Equations in n Unknowns 

In earlier sections we obtained a wide range of theorems concerning linear systems of n equations in n unknowns. (See 
Theorem 4.3.4.) We shall now turn our attention to linear systems of m equations in n unknowns in which m and n need not be 
the same. 

The following theorem specifies conditions under which a linear system of m equations in n unknowns is guaranteed to be 
consistent. 

THEOREM 5.6.5 



The Consistency Theorem 

If $A\mathbf{x} = \mathbf{b}$ is a linear system of m equations in n unknowns, then the following are equivalent. 

(a) $A\mathbf{x} = \mathbf{b}$ is consistent. 

(b) $\mathbf{b}$ is in the column space of A. 

(c) The coefficient matrix A and the augmented matrix $[A \mid \mathbf{b}]$ have the same rank. 







Proof It suffices to prove the two equivalences (a) ⇔ (b) and (b) ⇔ (c), since it will then follow as a matter of logic that 
(a) ⇔ (c). 

(a) ⇔ (b) See Theorem 5.5.1. 

(b) ⇒ (c) We will show that if $\mathbf{b}$ is in the column space of A, then the column spaces of A and $[A \mid \mathbf{b}]$ are actually the same, 
from which it will follow that these two matrices have the same rank. 

By definition, the column space of a matrix is the space spanned by its column vectors, so the column spaces of A and 
$[A \mid \mathbf{b}]$ can be expressed as 

\[
\operatorname{span}\{\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_n\}
\quad \text{and} \quad
\operatorname{span}\{\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_n, \mathbf{b}\}
\]

respectively. If $\mathbf{b}$ is in the column space of A, then each vector in the set $\{\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_n, \mathbf{b}\}$ is a linear combination of the 
vectors in $\{\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_n\}$ and conversely (why?). Thus, from Theorem 5.2.4, the column spaces of A and $[A \mid \mathbf{b}]$ are the 
same. 

(c) ⇒ (b) Assume that A and $[A \mid \mathbf{b}]$ have the same rank r. By Theorem 5.4.6a, there is some subset of the column vectors of 
A that forms a basis for the column space of A. Suppose that those column vectors are 

\[
\mathbf{c}'_1, \mathbf{c}'_2, \ldots, \mathbf{c}'_r
\]

These r basis vectors also belong to the r-dimensional column space of $[A \mid \mathbf{b}]$; hence they also form a basis for the column 
space of $[A \mid \mathbf{b}]$ by Theorem 5.4.6a. This means that $\mathbf{b}$ is expressible as a linear combination of $\mathbf{c}'_1, \mathbf{c}'_2, \ldots, \mathbf{c}'_r$, and 
consequently $\mathbf{b}$ lies in the column space of A. 

■ 

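It is not hard to visualize why this theorem is true if one views the rank of a matrix as the number of nonzero rows in its 
reduced row-echelon form. For example, the augmented matrix for the system 

\[
\begin{aligned}
x_1 - 2x_2 - 3x_3 + 2x_4 &= -4 \\
-3x_1 + 7x_2 - x_3 + x_4 &= -3 \\
2x_1 - 5x_2 + 4x_3 - 3x_4 &= 7 \\
-3x_1 + 6x_2 + 9x_3 - 6x_4 &= -1
\end{aligned}
\]

is 

\[
\left[\begin{array}{rrrr|r}
1 & -2 & -3 & 2 & -4 \\
-3 & 7 & -1 & 1 & -3 \\
2 & -5 & 4 & -3 & 7 \\
-3 & 6 & 9 & -6 & -1
\end{array}\right]
\]

which has the following reduced row-echelon form (verify): 

\[
\left[\begin{array}{rrrr|r}
1 & 0 & -23 & 16 & 0 \\
0 & 1 & -10 & 7 & 0 \\
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0
\end{array}\right]
\]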



We see from the third row in this matrix that the system is inconsistent. However, it is also because of this row that the reduced 
row-echelon form of the augmented matrix has fewer zero rows than the reduced row-echelon form of the coefficient matrix. 
This forces the coefficient matrix and the augmented matrix for the system to have different ranks. 
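Part (c) of the Consistency Theorem gives a test that is easy to automate. The following minimal Python sketch is not from the text: the helper names `rank` and `is_consistent` are our own, and exact fractions are an implementation choice. It compares the rank of A with the rank of the augmented matrix for the system discussed above.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix by forward elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0  # number of pivot rows found so far
    for c in range(len(m[0])):
        p = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        m[rk], m[p] = m[p], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][c] / m[rk][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
        if rk == len(m):
            break
    return rk

def is_consistent(A, b):
    """Ax = b is consistent exactly when rank(A) equals rank([A | b])."""
    augmented = [row + [bi] for row, bi in zip(A, b)]
    return rank(A) == rank(augmented)

A = [[1, -2, -3, 2],
     [-3, 7, -1, 1],
     [2, -5, 4, -3],
     [-3, 6, 9, -6]]
b = [-4, -3, 7, -1]

print(rank(A), rank([row + [bi] for row, bi in zip(A, b)]))
print(is_consistent(A, b))
```

Replacing b by the first column of A, which certainly lies in the column space, makes the two ranks agree and the test succeed.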

The Consistency Theorem is concerned with conditions under which a linear system $A\mathbf{x} = \mathbf{b}$ is consistent for a specific vector $\mathbf{b}$. 
The following theorem is concerned with conditions under which a linear system is consistent for all possible choices of b. 



THEOREM 5.6.6 



If $A\mathbf{x} = \mathbf{b}$ is a linear system of m equations in n unknowns, then the following are equivalent. 

(a) $A\mathbf{x} = \mathbf{b}$ is consistent for every m × 1 matrix $\mathbf{b}$. 

(b) The column vectors of A span $R^m$. 

(c) rank(A) = m. 



Proof It suffices to prove the two equivalences (a) ⇔ (b) and (a) ⇔ (c), since it will then follow as a matter of logic that 
(b) ⇔ (c). 

(a) ⇔ (b) From Formula 2 of Section 5.5, the system $A\mathbf{x} = \mathbf{b}$ can be expressed as 

\[
x_1\mathbf{c}_1 + x_2\mathbf{c}_2 + \cdots + x_n\mathbf{c}_n = \mathbf{b}
\]

from which we can conclude that $A\mathbf{x} = \mathbf{b}$ is consistent for every m × 1 matrix $\mathbf{b}$ if and only if every such $\mathbf{b}$ is expressible as a 
linear combination of the column vectors $\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_n$, or, equivalently, if and only if these column vectors span $R^m$. 

(a) ⇒ (c) From the assumption that $A\mathbf{x} = \mathbf{b}$ is consistent for every m × 1 matrix $\mathbf{b}$, and from parts (a) and (b) of the Consistency 
Theorem (Theorem 5.6.5), it follows that every vector $\mathbf{b}$ in $R^m$ lies in the column space of A; that is, the column space of A is 
all of $R^m$. Thus $\operatorname{rank}(A) = \dim(R^m) = m$. 

(c) ⇒ (a) From the assumption that rank(A) = m, it follows that the column space of A is a subspace of $R^m$ of dimension m and 
hence must be all of $R^m$ by Theorem 5.4.7. It now follows from parts (a) and (b) of the Consistency Theorem (Theorem 5.6.5) 
that $A\mathbf{x} = \mathbf{b}$ is consistent for every vector $\mathbf{b}$ in $R^m$, since every such $\mathbf{b}$ is in the column space of A. 



A linear system with more equations than unknowns is called an overdetermined linear system. If $A\mathbf{x} = \mathbf{b}$ is an overdetermined 
linear system of m equations in n unknowns (so that m > n), then the column vectors of A cannot span $R^m$; it follows from the 
last theorem that for a fixed m × n matrix A with m > n, the overdetermined linear system $A\mathbf{x} = \mathbf{b}$ cannot be consistent for every 
possible $\mathbf{b}$. 



EXAMPLE 5 An Overdetermined System 



The linear system 

\[
\begin{aligned}
x_1 - 2x_2 &= b_1 \\
x_1 - x_2 &= b_2 \\
x_1 + x_2 &= b_3 \\
x_1 + 2x_2 &= b_4 \\
x_1 + 3x_2 &= b_5
\end{aligned}
\]

is overdetermined, so it cannot be consistent for all possible values of $b_1$, $b_2$, $b_3$, $b_4$, and $b_5$. Exact conditions under which the 
system is consistent can be obtained by solving the linear system by Gauss-Jordan elimination. We leave it for the reader to 
show that the augmented matrix is row equivalent to 

\[
\left[\begin{array}{cc|c}
1 & 0 & 2b_2 - b_1 \\
0 & 1 & b_2 - b_1 \\
0 & 0 & b_3 - 3b_2 + 2b_1 \\
0 & 0 & b_4 - 4b_2 + 3b_1 \\
0 & 0 & b_5 - 5b_2 + 4b_1
\end{array}\right]
\]

Thus, the system is consistent if and only if $b_1$, $b_2$, $b_3$, $b_4$, and $b_5$ satisfy the conditions 

\[
\begin{aligned}
2b_1 - 3b_2 + b_3 &= 0 \\
3b_1 - 4b_2 + b_4 &= 0 \\
4b_1 - 5b_2 + b_5 &= 0
\end{aligned}
\]

or, on solving this homogeneous linear system, 

\[
b_1 = 5r - 4s, \quad b_2 = 4r - 3s, \quad b_3 = 2r - s, \quad b_4 = r, \quad b_5 = s
\]

where r and s are arbitrary. 
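The consistency conditions of Example 5 can be spot-checked numerically. The sketch below is our own illustration; the names `COEFFS`, `solve_if_consistent`, and `b_from_parameters` are hypothetical. It uses the first two equations to find the only possible candidate for $(x_1, x_2)$, then tests that candidate against all five equations.

```python
# Coefficients (c1, c2) of the five equations c1*x1 + c2*x2 = b_i in Example 5.
COEFFS = [(1, -2), (1, -1), (1, 1), (1, 2), (1, 3)]

def solve_if_consistent(b):
    """Return (x1, x2) if the system is consistent for this b, else None."""
    b1, b2 = b[0], b[1]
    # The first two equations alone force these values of x2 and x1.
    x2 = b2 - b1
    x1 = 2 * b2 - b1
    ok = all(c1 * x1 + c2 * x2 == bi for (c1, c2), bi in zip(COEFFS, b))
    return (x1, x2) if ok else None

def b_from_parameters(r, s):
    """Right-hand sides of the form found at the end of Example 5."""
    return [5 * r - 4 * s, 4 * r - 3 * s, 2 * r - s, r, s]

print(solve_if_consistent(b_from_parameters(1, 0)))  # b = (5, 4, 2, 1, 0)
print(solve_if_consistent([1, 1, 1, 1, 2]))          # violates the last condition
```

Every right-hand side produced by `b_from_parameters` passes the test, while a vector that breaks any one of the three conditions is rejected.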

♦ 

In Formula 3 of Theorem 5.5.2, the scalars $c_1, c_2, \ldots, c_k$ are the arbitrary parameters in the general solutions of both $A\mathbf{x} = \mathbf{b}$ 
and $A\mathbf{x} = \mathbf{0}$. Thus these two systems have the same number of parameters in their general solutions. Moreover, it follows from 
part (b) of Theorem 5.6.4 that the number of such parameters is nullity(A). This fact and the Dimension Theorem for Matrices 
(Theorem 5.6.3) yield the following theorem. 



THEOREM 5.6.7 



If $A\mathbf{x} = \mathbf{b}$ is a consistent linear system of m equations in n unknowns, and if A has rank r, then the general solution of the 
system contains n - r parameters. 



EXAMPLE 6 Number of Parameters in a General Solution 



If A is a 5 × 7 matrix with rank 4, and if $A\mathbf{x} = \mathbf{b}$ is a consistent linear system, then the general solution of the system contains 
7 - 4 = 3 parameters. 

♦ 

In earlier sections we obtained a wide range of conditions under which a homogeneous linear system $A\mathbf{x} = \mathbf{0}$ of n equations in n 
unknowns is guaranteed to have only the trivial solution. (See Theorem 4.3.4.) The following theorem obtains some 
corresponding results for systems of m equations in n unknowns, where m and n may differ. 

THEOREM 5.6.8 



If A is an m × n matrix, then the following are equivalent. 

(a) $A\mathbf{x} = \mathbf{0}$ has only the trivial solution. 

(b) The column vectors of A are linearly independent. 

(c) $A\mathbf{x} = \mathbf{b}$ has at most one solution (none or one) for every m × 1 matrix $\mathbf{b}$. 



Proof It suffices to prove the two equivalences (a) ⇔ (b) and (a) ⇔ (c), since it will then follow as a matter of logic that 
(b) ⇔ (c). 

(a) ⇔ (b) If $\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_n$ are the column vectors of A, then the linear system $A\mathbf{x} = \mathbf{0}$ can be written as 

\[
x_1\mathbf{c}_1 + x_2\mathbf{c}_2 + \cdots + x_n\mathbf{c}_n = \mathbf{0}
\tag{6}
\]

If $\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_n$ are linearly independent vectors, then this equation is satisfied only by $x_1 = x_2 = \cdots = x_n = 0$, which means that 
$A\mathbf{x} = \mathbf{0}$ has only the trivial solution. Conversely, if $A\mathbf{x} = \mathbf{0}$ has only the trivial solution, then Equation (6) is satisfied only by 
$x_1 = x_2 = \cdots = x_n = 0$, which means that $\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_n$ are linearly independent. 

(a) ⇒ (c) Assume that $A\mathbf{x} = \mathbf{0}$ has only the trivial solution. Either $A\mathbf{x} = \mathbf{b}$ is consistent or it is not. If it is not consistent, then 
there are no solutions of $A\mathbf{x} = \mathbf{b}$, and we are done. If $A\mathbf{x} = \mathbf{b}$ is consistent, let $\mathbf{x}_0$ be any solution. From the discussion following 
Theorem 5.5.2 and the fact that $A\mathbf{x} = \mathbf{0}$ has only the trivial solution, we conclude that the general solution of $A\mathbf{x} = \mathbf{b}$ is 
$\mathbf{x}_0 + \mathbf{0} = \mathbf{x}_0$. Thus the only solution of $A\mathbf{x} = \mathbf{b}$ is $\mathbf{x}_0$. 

(c) ⇒ (a) Assume that $A\mathbf{x} = \mathbf{b}$ has at most one solution for every m × 1 matrix $\mathbf{b}$. Then, in particular, $A\mathbf{x} = \mathbf{0}$ has at most one 
solution. Thus $A\mathbf{x} = \mathbf{0}$ has only the trivial solution. 

■ 

A linear system with more unknowns than equations is called an underdetermined linear system. If $A\mathbf{x} = \mathbf{b}$ is a consistent 
underdetermined linear system of m equations in n unknowns (so that m < n), then it follows from Theorem 5.6.7 that the 
general solution has at least one parameter (why?); hence a consistent underdetermined linear system must have infinitely many 
solutions. In particular, an underdetermined homogeneous linear system has infinitely many solutions, though this was already 
proved in Chapter 1 (Theorem 1.2.1). 



EXAMPLE 7 An Underdetermined System 



If A is a 5 × 7 matrix, then for every 5 × 1 matrix $\mathbf{b}$, the linear system $A\mathbf{x} = \mathbf{b}$ is underdetermined. Thus $A\mathbf{x} = \mathbf{b}$ must be 
consistent for some $\mathbf{b}$, and for each such $\mathbf{b}$ the general solution must have 7 - r parameters, where r is the rank of A. 

Summary 

In Theorem 4.3.4 we listed eight results that are equivalent to the invertibility of a matrix A. We conclude this section by 
adding eight more results to that list to produce the following theorem, which relates all of the major topics we have studied 
thus far. 

THEOREM 5.6.9 



Equivalent Statements 

If A is an n × n matrix, and if $T_A : R^n \to R^n$ is multiplication by A, then the following are equivalent. 

(a) A is invertible. 

(b) $A\mathbf{x} = \mathbf{0}$ has only the trivial solution. 

(c) The reduced row-echelon form of A is $I_n$. 

(d) A is expressible as a product of elementary matrices. 

(e) $A\mathbf{x} = \mathbf{b}$ is consistent for every n × 1 matrix $\mathbf{b}$. 

(f) $A\mathbf{x} = \mathbf{b}$ has exactly one solution for every n × 1 matrix $\mathbf{b}$. 

(g) $\det(A) \ne 0$. 

(h) The range of $T_A$ is $R^n$. 

(i) $T_A$ is one-to-one. 

(j) The column vectors of A are linearly independent. 

(k) The row vectors of A are linearly independent. 

(l) The column vectors of A span $R^n$. 

(m) The row vectors of A span $R^n$. 

(n) The column vectors of A form a basis for $R^n$. 

(o) The row vectors of A form a basis for $R^n$. 

(p) A has rank n. 

(q) A has nullity 0. 





Proof We already know from Theorem 4.3.4 that statements (a) through (i) are equivalent. To complete the proof, we will 
show that (j) through (q) are equivalent to (b) by proving the sequence of implications 

(b) ⇒ (j) ⇒ (k) ⇒ (l) ⇒ (m) ⇒ (n) ⇒ (o) ⇒ (p) ⇒ (q) ⇒ (b). 

(b) ⇒ (j) If $A\mathbf{x} = \mathbf{0}$ has only the trivial solution, then by Theorem 5.6.8, the column vectors of A are linearly independent. 

(j) ⇒ (k) ⇒ (l) ⇒ (m) ⇒ (n) ⇒ (o) This follows from Theorem 5.4.5 and the fact that $R^n$ is an n-dimensional vector space. 
(The details are omitted.) 

(o) ⇒ (p) If the n row vectors of A form a basis for $R^n$, then the row space of A is n-dimensional and A has rank n. 

(p) ⇒ (q) This follows from the Dimension Theorem (Theorem 5.6.3). 

(q) ⇒ (b) If A has nullity 0, then the solution space of $A\mathbf{x} = \mathbf{0}$ has dimension 0, which means that it contains only the zero 
vector. Hence $A\mathbf{x} = \mathbf{0}$ has only the trivial solution. 



Exercise Set 5.6 






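1. Verify that $\operatorname{rank}(A) = \operatorname{rank}(A^T)$. 

\[
A = \begin{bmatrix}
1 & 2 & 4 & 0 \\
-3 & 1 & 5 & 2 \\
-2 & 3 & 9 & 2
\end{bmatrix}
\]

2. Find the rank and nullity of the matrix; then verify that the values obtained satisfy Formula (4) of the Dimension Theorem. 

(a) 

\[
A = \begin{bmatrix}
1 & -1 & 3 \\
5 & -4 & -4 \\
7 & -6 & 2
\end{bmatrix}
\]

(b) 

\[
A = \begin{bmatrix}
2 & 0 & -1 \\
4 & 0 & -2 \\
0 & 0 & 0
\end{bmatrix}
\]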









(c) 

\[
A = \begin{bmatrix}
1 & 4 & 5 & 2 \\
2 & 1 & 3 & 0 \\
-1 & 3 & 2 & 2
\end{bmatrix}
\]



(d) 



A = 



(e) 



A = 



1 


4 


5 


6 


3 


-2 


1 


4 


1 





-1 


-2 


2 


3 


5 


7 


1 


-3 


2 


2 





3 


6 





2 


-3 


-2 


4 


3 


-6 





6 


2 


9 


2 


-4 



3. In each part of Exercise 2, use the results obtained to find the number of leading variables and the number of parameters in 
the solution of $A\mathbf{x} = \mathbf{0}$ without solving the system. 

4. In each part, use the information in the table to find the dimension of the row space, column space, and nullspace of A, and 
of the nullspace of $A^T$. 

             (a)     (b)     (c)     (d)     (e)     (f)     (g) 
Size of A    3 × 3   3 × 3   3 × 3   5 × 9   9 × 5   4 × 4   6 × 2 
Rank(A)      3       2       1       2       2       0       2 


5. In each part, find the largest possible value for the rank of A and the smallest possible value for the nullity of A. 

(a) A is 4 × 4 

(b) A is 3 × 5 

(c) A is 5 × 3 



6. If A is an m × n matrix, what are the largest possible value for its rank and the smallest possible value for its nullity? 
Hint See Exercise 5. 



7. In each part, use the information in the table to determine whether the linear system $A\mathbf{x} = \mathbf{b}$ is consistent. If so, state the 
number of parameters in its general solution. 

               (a)     (b)     (c)     (d)     (e)     (f)     (g) 
Size of A      3 × 3   3 × 3   3 × 3   5 × 9   5 × 9   4 × 4   6 × 2 
Rank(A)        3       2       1       2       2       0       2 
Rank[A | b]    3       3       1       2       3       0       2 



8. For each of the matrices in Exercise 7, find the nullity of A, and determine the number of parameters in the general solution 
of the homogeneous linear system $A\mathbf{x} = \mathbf{0}$. 



9. What conditions must be satisfied by $b_1$, $b_2$, $b_3$, $b_4$, and $b_5$ for the overdetermined linear system 

\[
\begin{aligned}
x_1 - 3x_2 &= b_1 \\
x_1 - 2x_2 &= b_2 \\
x_1 + x_2 &= b_3 \\
x_1 - 4x_2 &= b_4 \\
x_1 + 5x_2 &= b_5
\end{aligned}
\]

to be consistent? 



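10. Let 

\[
A = \begin{bmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23}
\end{bmatrix}
\]

Show that A has rank 2 if and only if one or more of the determinants 

\[
\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \quad
\begin{vmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{vmatrix}, \quad
\begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix}
\]

are nonzero. 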



11. Suppose that A is a 3 × 3 matrix whose nullspace is a line through the origin in 3-space. Can the row or column space of A 
also be a line through the origin? Explain. 



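12. Discuss how the rank of A varies with t. 

(a) 

\[
A = \begin{bmatrix}
1 & 1 & t \\
1 & t & 1 \\
t & 1 & 1
\end{bmatrix}
\]

(b) 

\[
A = \begin{bmatrix}
t & 3 & -1 \\
3 & 6 & -2 \\
-1 & -3 & t
\end{bmatrix}
\]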



13. Are there values of r and s for which 

\[
\begin{bmatrix}
1 & 0 & 0 \\
0 & r - 2 & 2 \\
0 & s - 1 & r + 2 \\
0 & 0 & 3
\end{bmatrix}
\]

has rank 1 or 2? If so, find those values. 

14. Use the result in Exercise 10 to show that the set of points (x, y, z) in $R^3$ for which the matrix 

\[
\begin{bmatrix}
x & y & z \\
1 & x & y
\end{bmatrix}
\]

has rank 1 is the curve with parametric equations $x = t$, $y = t^2$, $z = t^3$. 



15. Prove: If k ≠ 0, then A and kA have the same rank. 



Discussion 

Discovery 

16. 

(a) Give an example of a 3 × 3 matrix whose column space is a plane through the origin in 
3-space. 

(b) What kind of geometric object is the nullspace of your matrix? 

(c) What kind of geometric object is the row space of your matrix? 

(d) In general, if the column space of a 3 × 3 matrix is a plane through the origin in 3-space, 
what can you say about the geometric properties of the nullspace and row space? Explain 
your reasoning. 

17. Indicate whether each statement is always true or sometimes false. Justify your answer by giving 
a logical argument or a counterexample. 

(a) If A is not square, then the row vectors of A must be linearly dependent. 

(b) If A is square, then either the row vectors or the column vectors of A must be linearly 
independent. 

(c) If the row vectors and the column vectors of A are linearly independent, then A must be 
square. 

(d) Adding one additional column to a matrix A increases its rank by one. 

18. 

(a) If A is a 3 × 5 matrix, then the number of leading 1's in the reduced row-echelon form of 
A is at most ______. Why? 

(b) If A is a 3 × 5 matrix, then the number of parameters in the general solution of $A\mathbf{x} = \mathbf{0}$ is 
at most ______. Why? 

(c) If A is a 5 × 3 matrix, then the number of leading 1's in the reduced row-echelon form of 
A is at most ______. Why? 

(d) If A is a 5 × 3 matrix, then the number of parameters in the general solution of $A\mathbf{x} = \mathbf{0}$ is 
at most ______. Why? 

19. 

(a) If A is a 3 × 5 matrix, then the rank of A is at most ______. Why? 

(b) If A is a 3 × 5 matrix, then the nullity of A is at most ______. Why? 

(c) If A is a 3 × 5 matrix, then the rank of $A^T$ is at most ______. Why? 

(d) If A is a 3 × 5 matrix, then the nullity of $A^T$ is at most ______. Why? 
Chapter 5 



Supplementary Exercises 

1. In each part, the solution space is a subspace of $R^3$ and so must be a line through the origin, a plane through the origin, all 
of $R^3$, or the origin only. For each system, determine which is the case. If the subspace is a plane, find an equation for it, 
and if it is a line, find parametric equations. 

(a) 
\[
0x + 0y + 0z = 0
\]

(b) 
\[
\begin{aligned}
2x - 3y + z &= 0 \\
6x - 9y + 3z &= 0 \\
-4x + 6y - 2z &= 0
\end{aligned}
\]

(c) 
\[
\begin{aligned}
x - 2y + 7z &= 0 \\
-4x + 8y + 5z &= 0 \\
2x - 4y + 3z &= 0
\end{aligned}
\]

(d) 
\[
\begin{aligned}
x + 4y + 8z &= 0 \\
2x + 5y + 6z &= 0 \\
3x + y - 4z &= 0
\end{aligned}
\]



2. For what values of s is the solution space of 

\[
\begin{aligned}
x_1 + x_2 + s x_3 &= 0 \\
x_1 + s x_2 + x_3 &= 0 \\
s x_1 + x_2 + x_3 &= 0
\end{aligned}
\]

the origin only, a line through the origin, a plane through the origin, or all of $R^3$? 



3. 

(a) Express (4a, a - b, a + 2b) as a linear combination of (4, 1, 1) and (0, -1, 2). 

(b) Express (3a + b + 3c, -a + 4b - c, 2a + b + 2c) as a linear combination of (3, -1, 2) and (1, 4, 1). 

(c) Express (2a - b + 4c, 3a - c, 4b + c) as a linear combination of three nonzero vectors. 

4. Let W be the space spanned by $\mathbf{f} = \sin x$ and $\mathbf{g} = \cos x$. 

(a) Show that for any value of $\theta$, $\mathbf{f}_1 = \sin(x + \theta)$ and $\mathbf{g}_1 = \cos(x + \theta)$ are vectors in W. 

(b) Show that $\mathbf{f}_1$ and $\mathbf{g}_1$ form a basis for W. 



5. 

(a) Express $\mathbf{v} = (1, 1)$ as a linear combination of $\mathbf{v}_1 = (1, -1)$, $\mathbf{v}_2 = (3, 0)$, and $\mathbf{v}_3 = (2, 1)$ in two different ways. 

(b) Explain why this does not violate Theorem 5.4.1. 

6. Let A be an n × n matrix, and let $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ be linearly independent vectors in $R^n$ expressed as n × 1 matrices. What 
must be true about A for $A\mathbf{v}_1, A\mathbf{v}_2, \ldots, A\mathbf{v}_n$ to be linearly independent? 

7. Must a basis for $P_n$ contain a polynomial of degree k for each k = 0, 1, 2, ..., n? Justify your answer. 



8. For purposes of this problem, let us define a "checkerboard matrix" to be a square matrix $A = [a_{ij}]$ such that 

\[
a_{ij} = \begin{cases}
1 & \text{if } i + j \text{ is even} \\
0 & \text{if } i + j \text{ is odd}
\end{cases}
\]

Find the rank and nullity of the following checkerboard matrices: 

(a) the 3 × 3 checkerboard matrix 

(b) the 4 × 4 checkerboard matrix 

(c) the n × n checkerboard matrix 



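9. For purposes of this exercise, let us define an "X-matrix" to be a square matrix with an odd number of rows and columns 
that has 0's everywhere except on the two diagonals, where it has 1's. Find the rank and nullity of the following 
X-matrices: 

(a) 

\[
\begin{bmatrix}
1 & 0 & 1 \\
0 & 1 & 0 \\
1 & 0 & 1
\end{bmatrix}
\]

(b) 

\[
\begin{bmatrix}
1 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 1
\end{bmatrix}
\]

(c) the X-matrix of size (2n + 1) × (2n + 1) 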



10. In each part, show that the set of polynomials is a subspace of $P_n$ and find a basis for it. 

(a) all polynomials in $P_n$ such that $p(-x) = p(x)$ 

(b) all polynomials in $P_n$ such that $p(0) = 0$ 

11. (For Readers Who Have Studied Calculus) Show that the set of all polynomials in $P_n$ that have a horizontal tangent 
at x = 0 is a subspace of $P_n$. Find a basis for this subspace. 



12. 

(a) Find a basis for the vector space of all 3 × 3 symmetric matrices. 

(b) Find a basis for the vector space of all 3 × 3 skew-symmetric matrices. 



13. In advanced linear algebra, one proves the following determinant criterion for rank: The rank of a matrix A is r if and 
only if A has some r × r submatrix with a nonzero determinant, and all square submatrices of larger size have 
determinant zero. (A submatrix of A is any matrix obtained by deleting rows or columns of A. The matrix A itself is also 
considered to be a submatrix of A.) In each part, use this criterion to find the rank of the matrix. 

(a) 

\[
\begin{bmatrix}
1 & 2 & 0 \\
2 & 4 & -1
\end{bmatrix}
\]

(b) 

\[
\begin{bmatrix}
1 & 2 & 3 \\
2 & 4 & 6
\end{bmatrix}
\]

(c) 

\[
\begin{bmatrix}
1 & 0 & 1 \\
2 & -1 & 3 \\
3 & -1 & 4
\end{bmatrix}
\]

(d) 

\[
\begin{bmatrix}
1 & -1 & 2 & 0 \\
3 & 1 & 0 & 0 \\
-1 & 2 & 4 & 0
\end{bmatrix}
\]



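14. Use the result in Exercise 13 to find the possible ranks for matrices of the form 

\[
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 & a_{16} \\
0 & 0 & 0 & 0 & 0 & a_{26} \\
0 & 0 & 0 & 0 & 0 & a_{36} \\
0 & 0 & 0 & 0 & 0 & a_{46} \\
a_{51} & a_{52} & a_{53} & a_{54} & a_{55} & a_{56}
\end{bmatrix}
\]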



15. Prove: If S is a basis for a vector space V, then for any vectors $\mathbf{u}$ and $\mathbf{v}$ in V and any scalar k, the following relationships 
hold: 

(a) $(\mathbf{u} + \mathbf{v})_S = (\mathbf{u})_S + (\mathbf{v})_S$ 

(b) $(k\mathbf{u})_S = k(\mathbf{u})_S$ 
Chapter 5 



Technology Exercises 



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple, 
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 

Section 5.2 



T1. 

(a) Some technology utilities do not have direct commands for finding linear combinations of vectors in $R^n$. However, 
you can use matrix multiplication to calculate a linear combination by creating a matrix A with the vectors as columns 
and a column vector $\mathbf{x}$ with the coefficients as entries. Use this method to compute the vector 

\[
\mathbf{v} = 6(8, -2, 1, -4) + 17(-3, 9, 11, 6) - 9(0, -1, 2, 4)
\]

Check your work by hand. 

(b) Use your technology utility to determine whether the vector (9, 1, 0) is a linear combination of the vectors (1, 2, 3), 
(1, 4, 6), and (2, -3, -5). 



Section 5.3 

T1. Use your technology utility to perform the Wronskian test of linear independence on the sets in Exercise 20. 

Section 5.4 

T1. (Linear Independence) Devise three different procedures for using your technology utility to determine whether a set of n 
vectors in $R^n$ is linearly independent, and use all of your procedures to determine whether the vectors 

\[
\mathbf{v}_1 = (4, -5, 2, 6), \quad \mathbf{v}_2 = (2, -2, 1, 3), \quad \mathbf{v}_3 = (6, -3, 3, 9), \quad \mathbf{v}_4 = (4, -1, 5, 6)
\]

are linearly independent. 

T2. (Dimension) Devise three different procedures for using your technology utility to determine the dimension of the subspace 
spanned by a set of vectors in $R^n$, and use all of your procedures to determine the dimension of the subspace of $R^5$ spanned 
by the vectors 

\[
\mathbf{v}_1 = (2, 2, -1, 0, 1), \quad \mathbf{v}_2 = (-1, -1, 2, -3, 1), \quad
\mathbf{v}_3 = (1, 1, -2, 0, -1), \quad \mathbf{v}_4 = (0, 0, 1, 1, 1)
\]

Section 5.5 



T1. (Basis for Row Space) Some technology utilities provide a command for finding a basis for the row space of a matrix. If 
your utility has this capability, read the documentation and then use your utility to find a basis for the row space of the matrix 
in Example 6. 

T2. (Basis for Column Space) Some technology utilities provide a command for finding a basis for the column space of a 
matrix. If your utility has this capability, read the documentation and then use your utility to find a basis for the column space 
of the matrix in Example 6. 

T3. (Nullspace) Some technology utilities provide a command for finding a basis for the nullspace of a matrix. If your utility has 
this capability, read the documentation and then check your understanding of the procedure by finding a basis for the 
nullspace of the matrix A in Example 4. Use this result to find the general solution of the homogeneous system $A\mathbf{x} = \mathbf{0}$. 

Section 5.6 

T1. (Rank and Nullity) Read your documentation on finding the rank of a matrix, and then use your utility to find the rank of 
the matrix A in Example 1. Find the nullity of the matrix using Theorem 5.6.3 and the rank. 

T2. There is a result, called Sylvester's inequality, which states that if A and B are n × n matrices with rank $r_A$ and $r_B$, 
respectively, then the rank $r_{AB}$ of AB satisfies the inequality $r_A + r_B - n \le r_{AB} \le \min(r_A, r_B)$, where $\min(r_A, r_B)$ denotes the 
smaller of $r_A$ and $r_B$ or their common value if the two ranks are the same. Use your technology utility to confirm this result 
for some matrices of your choice. 
CHAPTER 6 

Inner Product Spaces 

INTRODUCTION: In Section 3.3 we defined the Euclidean inner product on the spaces $R^2$ and $R^3$. Then, in Section 4.1, 
we extended that concept to $R^n$ and used it to define notions of length, distance, and angle in $R^n$. In this section we shall 
extend the concept of an inner product still further by extracting the most important properties of the Euclidean inner product on 
$R^n$ and turning them into axioms that are applicable in general vector spaces. Thus, when these axioms are satisfied, they will 
produce generalized inner products that automatically have the most important properties of Euclidean inner products. It will 
then be reasonable to use these generalized inner products to define notions of length, distance, and angle in general vector 
spaces. 
6.1 

INNER PRODUCTS 



In this section we shall use the most important properties of the Euclidean inner 
product as axioms for defining the general concept of an inner product. We will 
then show how an inner product can be used to define notions of length and 
distance in vector spaces other than $R^n$. 



General Inner Products 

In Section 4.1 we denoted the Euclidean inner product of two vectors in $R^n$ by the notation $\mathbf{u} \cdot \mathbf{v}$. It will be convenient in this 
section to introduce the alternative notation $\langle \mathbf{u}, \mathbf{v} \rangle$ for the general inner product. With this new notation, the fundamental properties 
of the Euclidean inner product that were listed in Theorem 4.1.2 are precisely the axioms in the following definition. 



DEFINITION 



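An inner product on a real vector space V is a function that associates a real number $\langle \mathbf{u}, \mathbf{v} \rangle$ with each pair of vectors $\mathbf{u}$ and $\mathbf{v}$ in 
V in such a way that the following axioms are satisfied for all vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{z}$ in V and all scalars k. 

1. $\langle \mathbf{u}, \mathbf{v} \rangle = \langle \mathbf{v}, \mathbf{u} \rangle$ [Symmetry axiom] 

2. $\langle \mathbf{u} + \mathbf{v}, \mathbf{z} \rangle = \langle \mathbf{u}, \mathbf{z} \rangle + \langle \mathbf{v}, \mathbf{z} \rangle$ [Additivity axiom] 

3. $\langle k\mathbf{u}, \mathbf{v} \rangle = k\langle \mathbf{u}, \mathbf{v} \rangle$ [Homogeneity axiom] 

4. $\langle \mathbf{v}, \mathbf{v} \rangle \ge 0$, and $\langle \mathbf{v}, \mathbf{v} \rangle = 0$ if and only if $\mathbf{v} = \mathbf{0}$ [Positivity axiom] 

A real vector space with an inner product is called a real inner product space. 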



Remark In Chapter 10 we shall study inner products over complex vector spaces. However, until that time we shall use the term 
inner product space to mean "real inner product space." 

Because the inner product axioms are based on properties of the Euclidean inner product, the Euclidean inner product satisfies 
these axioms; this is the content of the following example. 



EXAMPLE 1 Euclidean Inner Product on R n 



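If $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ are vectors in $R^n$, then the formula 

\[
\langle \mathbf{u}, \mathbf{v} \rangle = \mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n
\]

defines $\langle \mathbf{u}, \mathbf{v} \rangle$ to be the Euclidean inner product on $R^n$. The four inner product axioms hold by Theorem 4.1.2. 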



The Euclidean inner product is the most important inner product on $R^n$. However, there are various applications in which it is 
desirable to modify the Euclidean inner product by weighting its terms differently. More precisely, if 

\[
w_1, w_2, \ldots, w_n
\]

are positive real numbers, which we shall call weights, and if $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ are vectors in $R^n$, then it 
can be shown (Exercise 26) that the formula 

\[
\langle \mathbf{u}, \mathbf{v} \rangle = w_1 u_1 v_1 + w_2 u_2 v_2 + \cdots + w_n u_n v_n
\tag{1}
\]

defines an inner product on $R^n$; it is called the weighted Euclidean inner product with weights $w_1, w_2, \ldots, w_n$. 



To illustrate one way in which a weighted Euclidean inner product can arise, suppose that some physical experiment can produce
any of n possible numerical values

$x_1, x_2, \ldots, x_n$

and that a series of m repetitions of the experiment yields these values with various frequencies; that is, $x_1$ occurs $f_1$ times, $x_2$
occurs $f_2$ times, and so forth. Since there are a total of m repetitions of the experiment,

$f_1 + f_2 + \cdots + f_n = m$

Thus the arithmetic average, or mean, of the observed numerical values (denoted by $\bar{x}$) is

$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n} = \frac{1}{m}(f_1 x_1 + f_2 x_2 + \cdots + f_n x_n)$    (2)

If we let

$f = (f_1, f_2, \ldots, f_n)$, $x = (x_1, x_2, \ldots, x_n)$, $w_1 = w_2 = \cdots = w_n = 1/m$

then (2) can be expressed as the weighted inner product

$\bar{x} = \langle f, x \rangle = w_1 f_1 x_1 + w_2 f_2 x_2 + \cdots + w_n f_n x_n$
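As an informal numerical illustration of Formula (2), the following Python sketch computes a mean both directly and as a weighted inner product with equal weights 1/m. The data values and the function name `weighted_inner` are hypothetical and are not taken from the text.

```python
def weighted_inner(u, v, w):
    """Weighted Euclidean inner product: sum of w_i * u_i * v_i."""
    return sum(wi * ui * vi for wi, ui, vi in zip(w, u, v))

# Hypothetical data: the value x_i was observed f_i times.
x = [1.0, 2.0, 5.0]
f = [3, 5, 2]

m = sum(f)                  # total number of repetitions
w = [1.0 / m] * len(x)      # equal weights w_1 = ... = w_n = 1/m

mean_via_inner = weighted_inner(f, x, w)
mean_direct = sum(fi * xi for fi, xi in zip(f, x)) / m
```

Both quantities agree up to rounding, which is all that Formula (2) asserts.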

Remark It will always be assumed that $R^n$ has the Euclidean inner product unless some other inner product is explicitly specified.
As defined in Section 4.1, we refer to $R^n$ with the Euclidean inner product as Euclidean n-space.



EXAMPLE 2 Weighted Euclidean Inner Product 



Let $u = (u_1, u_2)$ and $v = (v_1, v_2)$ be vectors in $R^2$. Verify that the weighted Euclidean inner product

$\langle u, v \rangle = 3u_1 v_1 + 2u_2 v_2$

satisfies the four inner product axioms.

Solution

Note first that if u and v are interchanged in this equation, the right side remains the same. Therefore,

$\langle u, v \rangle = \langle v, u \rangle$

which establishes the first axiom. If $z = (z_1, z_2)$, then

$\langle u + v, z \rangle = 3(u_1 + v_1)z_1 + 2(u_2 + v_2)z_2$
$= (3u_1 z_1 + 2u_2 z_2) + (3v_1 z_1 + 2v_2 z_2)$
$= \langle u, z \rangle + \langle v, z \rangle$

which establishes the second axiom.

Next,

$\langle ku, v \rangle = 3(ku_1)v_1 + 2(ku_2)v_2 = k(3u_1 v_1 + 2u_2 v_2) = k\langle u, v \rangle$

which establishes the third axiom.

Finally,

$\langle v, v \rangle = 3v_1 v_1 + 2v_2 v_2 = 3v_1^2 + 2v_2^2$

Obviously, $\langle v, v \rangle = 3v_1^2 + 2v_2^2 \ge 0$. Further, $\langle v, v \rangle = 3v_1^2 + 2v_2^2 = 0$ if and only if $v_1 = v_2 = 0$, that is, if and only if
$v = (v_1, v_2) = 0$. Thus the fourth axiom is satisfied.
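The verification in Example 2 is a proof; a computer can only spot-check it on particular vectors. The following Python sketch does so for the inner product $\langle u, v \rangle = 3u_1 v_1 + 2u_2 v_2$, using hypothetical sample vectors and helper names (`ip`, `add`, `scale`) chosen here for illustration. Integer entries keep the arithmetic exact.

```python
def ip(u, v):
    """Weighted Euclidean inner product of Example 2 on R^2."""
    return 3 * u[0] * v[0] + 2 * u[1] * v[1]

def add(u, v):
    """Componentwise sum of two vectors in R^2."""
    return (u[0] + v[0], u[1] + v[1])

def scale(k, u):
    """Scalar multiple k*u of a vector in R^2."""
    return (k * u[0], k * u[1])

# Hypothetical sample vectors and scalar.
u, v, z, k = (1, -2), (4, 3), (-5, 7), 6

symmetry = ip(u, v) == ip(v, u)
additivity = ip(add(u, v), z) == ip(u, z) + ip(v, z)
homogeneity = ip(scale(k, u), v) == k * ip(u, v)
positivity = ip(v, v) > 0 and ip((0, 0), (0, 0)) == 0
```

A check of this kind can expose a formula that fails an axiom, but passing it for a few vectors does not replace the general argument given in the solution.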

Length and Distance in Inner Product Spaces 

Before discussing more examples of inner products, we shall pause to explain how inner products are used to introduce notions of
length and distance in inner product spaces. Recall that in Euclidean n-space the Euclidean length of a vector $u = (u_1, u_2, \ldots, u_n)$
can be expressed in terms of the Euclidean inner product as

$\|u\| = (u \cdot u)^{1/2}$

and the Euclidean distance between two arbitrary points $u = (u_1, u_2, \ldots, u_n)$ and $v = (v_1, v_2, \ldots, v_n)$ can be expressed as

$d(u, v) = \|u - v\| = [(u - v) \cdot (u - v)]^{1/2}$

[see Formulas 1 and 2 of Section 4.1]. Motivated by these formulas, we make the following definition.



DEFINITION 



If V is an inner product space, then the norm (or length) of a vector u in V is denoted by $\|u\|$ and is defined by

$\|u\| = \langle u, u \rangle^{1/2}$

The distance between two points (vectors) u and v is denoted by $d(u, v)$ and is defined by

$d(u, v) = \|u - v\|$



If a vector has norm 1, then we say that it is a unit vector. 



EXAMPLE 3 Norm and Distance in $R^n$



If $u = (u_1, u_2, \ldots, u_n)$ and $v = (v_1, v_2, \ldots, v_n)$ are vectors in $R^n$ with the Euclidean inner product, then

$\|u\| = \langle u, u \rangle^{1/2} = (u \cdot u)^{1/2} = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}$

and

$d(u, v) = \|u - v\| = \langle u - v, u - v \rangle^{1/2} = [(u - v) \cdot (u - v)]^{1/2}$



Observe that these are simply the standard formulas for the Euclidean norm and distance discussed in Section 4.1 [see Formulas 1 
and 2 in that section]. 



EXAMPLE 4 Using a Weighted Euclidean Inner Product 



It is important to keep in mind that norm and distance depend on the inner product being used. If the inner product is changed, then
the norms and distances between vectors also change. For example, for the vectors u = (1, 0) and v = (0, 1) in $R^2$ with the
Euclidean inner product, we have

$\|u\| = \sqrt{1^2 + 0^2} = 1$

and

$d(u, v) = \|u - v\| = \|(1, -1)\| = \sqrt{1^2 + (-1)^2} = \sqrt{2}$

However, if we change to the weighted Euclidean inner product of Example 2,

$\langle u, v \rangle = 3u_1 v_1 + 2u_2 v_2$

then we obtain

$\|u\| = \langle u, u \rangle^{1/2} = [3(1)(1) + 2(0)(0)]^{1/2} = \sqrt{3}$

and

$d(u, v) = \|u - v\| = \langle (1, -1), (1, -1) \rangle^{1/2}$
$= [3(1)(1) + 2(-1)(-1)]^{1/2} = \sqrt{5}$
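The dependence of norm and distance on the inner product can be seen in a few lines of Python. This is a sketch only: the helper names `euclid`, `weighted`, `norm`, and `dist` are illustrative, and the inner product is passed in as a function so that the same definitions of norm and distance serve both cases.

```python
import math

def euclid(u, v):
    """Euclidean inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]

def weighted(u, v):
    """Weighted Euclidean inner product 3*u1*v1 + 2*u2*v2 from Example 2."""
    return 3 * u[0] * v[0] + 2 * u[1] * v[1]

def norm(u, ip):
    """Norm induced by the inner product ip: sqrt(<u, u>)."""
    return math.sqrt(ip(u, u))

def dist(u, v, ip):
    """Distance induced by ip: the norm of u - v."""
    diff = (u[0] - v[0], u[1] - v[1])
    return norm(diff, ip)

u, v = (1, 0), (0, 1)

euclid_norm_u = norm(u, euclid)       # 1
euclid_dist = dist(u, v, euclid)      # sqrt(2)
weighted_norm_u = norm(u, weighted)   # sqrt(3)
weighted_dist = dist(u, v, weighted)  # sqrt(5)
```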

Unit Circles and Spheres in Inner Product Spaces 

If V is an inner product space, then the set of points in V that satisfy

$\|u\| = 1$

is called the unit sphere or sometimes the unit circle in V. In $R^2$ and $R^3$ these are the points that lie 1 unit away from the origin.



EXAMPLE 5 Unusual Unit Circles in $R^2$



(a) Sketch the unit circle in an xy-coordinate system in $R^2$ using the Euclidean inner product $\langle u, v \rangle = u_1 v_1 + u_2 v_2$.

(b) Sketch the unit circle in an xy-coordinate system in $R^2$ using the weighted Euclidean inner product

$\langle u, v \rangle = \frac{1}{9} u_1 v_1 + \frac{1}{4} u_2 v_2$

Solution (a)

If u = (x, y), then $\|u\| = \langle u, u \rangle^{1/2} = \sqrt{x^2 + y^2}$, so the equation of the unit circle is $\sqrt{x^2 + y^2} = 1$, or, on squaring both sides,

$x^2 + y^2 = 1$

As expected, the graph of this equation is a circle of radius 1 centered at the origin (Figure 6.1.1a). 



[Figure 6.1.1: (a) The unit circle with the Euclidean norm. (b) The unit circle with the weighted norm of part (b).]



Solution (b) 

If u = (x, y), then $\|u\| = \langle u, u \rangle^{1/2} = \sqrt{\frac{1}{9}x^2 + \frac{1}{4}y^2}$, so the equation of the unit circle is $\sqrt{\frac{1}{9}x^2 + \frac{1}{4}y^2} = 1$, or, on squaring both
sides,

$\frac{x^2}{9} + \frac{y^2}{4} = 1$

The graph of this equation is the ellipse shown in Figure 6.1.1b.
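To make the elliptical "unit circle" less mysterious, the following Python sketch samples points on the ellipse $x^2/9 + y^2/4 = 1$ and confirms that each has norm 1. It assumes the weights 1/9 and 1/4 for the inner product of part (b); the parametrization $(3\cos t, 2\sin t)$ and the sample values of t are choices made here for illustration.

```python
import math

def ip(u, v):
    """Weighted inner product with assumed weights 1/9 and 1/4."""
    return u[0] * v[0] / 9 + u[1] * v[1] / 4

def norm(u):
    """Norm induced by ip."""
    return math.sqrt(ip(u, u))

# Points on the ellipse x^2/9 + y^2/4 = 1, parametrized by t.
points = [(3 * math.cos(t), 2 * math.sin(t)) for t in (0.0, 0.5, 1.0, 2.0, 4.0)]
norms = [norm(p) for p in points]

# By contrast, (1, 0) lies on the Euclidean unit circle but has norm 1/3 here.
norm_e1 = norm((1.0, 0.0))
```

Every sampled point has norm 1 up to rounding, while the Euclidean unit vector (1, 0) is "short" with respect to this inner product.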



It would be reasonable for you to feel uncomfortable with the results in the last example, because although our definitions of 
length and distance reduce to the standard definitions when applied to $R^2$ with the Euclidean inner product, it does require a stretch
of the imagination to think of the unit "circle" as having an elliptical shape. However, even though nonstandard inner products 
distort familiar spaces and lead to strange values for lengths and distances, many of the basic theorems of Euclidean geometry 
continue to apply in these unusual spaces. For example, it is a basic fact in Euclidean geometry that the sum of the lengths of two 
sides of a triangle is at least as large as the length of the third side (Figure 6.1.2a). We shall see later that this familiar result holds 
in all inner product spaces, regardless of how unusual the inner product might be. As another example, recall the theorem from 
Euclidean geometry that states that the sum of the squares of the diagonals of a parallelogram is equal to the sum of the squares of 
the four sides (Figure 6.1.2b). This result also holds in all inner product spaces, regardless of the inner product (Exercise 20). 



[Figure 6.1.2: (a) $\|u + v\| \le \|u\| + \|v\|$. (b) $\|u + v\|^2 + \|u - v\|^2 = 2(\|u\|^2 + \|v\|^2)$.]



Inner Products Generated by Matrices 



The Euclidean inner product and the weighted Euclidean inner products are special cases of a general class of inner products on $R^n$,
which we shall now describe. Let

$u = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}$ and $v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$

be vectors in $R^n$ (expressed as n x 1 matrices), and let A be an invertible n x n matrix. It can be shown (Exercise 30) that if $u \cdot v$ is
the Euclidean inner product on $R^n$, then the formula

$\langle u, v \rangle = Au \cdot Av$    (3)

defines an inner product; it is called the inner product on $R^n$ generated by A.

Recalling that the Euclidean inner product $u \cdot v$ can be written as the matrix product $v^T u$ [see 7 in Section 4.1], it follows that (3)
can be written in the alternative form

$\langle u, v \rangle = (Av)^T Au$

or, equivalently,

$\langle u, v \rangle = v^T A^T A u$    (4)



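EXAMPLE 6 Inner Product Generated by the Identity Matrix



The inner product on $R^n$ generated by the n x n identity matrix is the Euclidean inner product, since substituting A = I in (3) yields

$\langle u, v \rangle = Iu \cdot Iv = u \cdot v$

The weighted Euclidean inner product $\langle u, v \rangle = 3u_1 v_1 + 2u_2 v_2$ discussed in Example 2 is the inner product on $R^2$ generated by

$A = \begin{bmatrix} \sqrt{3} & 0 \\ 0 & \sqrt{2} \end{bmatrix}$

because substituting this matrix in (4) yields

$\langle u, v \rangle = \begin{bmatrix} v_1 & v_2 \end{bmatrix} \begin{bmatrix} \sqrt{3} & 0 \\ 0 & \sqrt{2} \end{bmatrix} \begin{bmatrix} \sqrt{3} & 0 \\ 0 & \sqrt{2} \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$

$= \begin{bmatrix} v_1 & v_2 \end{bmatrix} \begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$

$= 3u_1 v_1 + 2u_2 v_2$

In general, the weighted Euclidean inner product

$\langle u, v \rangle = w_1 u_1 v_1 + w_2 u_2 v_2 + \cdots + w_n u_n v_n$

is the inner product on $R^n$ generated by

$A = \begin{bmatrix} \sqrt{w_1} & 0 & \cdots & 0 \\ 0 & \sqrt{w_2} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \sqrt{w_n} \end{bmatrix}$    (5)

(verify).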
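The "verify" above can be spot-checked numerically. The following Python sketch implements the inner product generated by a matrix, $\langle u, v \rangle = Au \cdot Av$, with plain lists; the helper names and the sample vectors are hypothetical. It compares the result for the diagonal matrix with entries $\sqrt{3}$ and $\sqrt{2}$ against $3u_1 v_1 + 2u_2 v_2$, and also checks that the identity matrix reproduces the Euclidean inner product.

```python
import math

def mat_vec(A, x):
    """Product of a matrix A (list of rows) with a vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(x, y))

def ip_generated(A, u, v):
    """Inner product generated by A: the Euclidean product of Au and Av."""
    return dot(mat_vec(A, u), mat_vec(A, v))

A = [[math.sqrt(3), 0.0], [0.0, math.sqrt(2)]]
I = [[1.0, 0.0], [0.0, 1.0]]

# Hypothetical sample vectors.
u, v = [1.0, -2.0], [4.0, 3.0]

generated = ip_generated(A, u, v)
weighted = 3 * u[0] * v[0] + 2 * u[1] * v[1]
identity_case = ip_generated(I, u, v)
```

Because square roots are computed in floating point, `generated` and `weighted` should be compared with a small tolerance rather than for exact equality.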



In the following examples we shall describe some inner products on vector spaces other than $R^n$.



EXAMPLE 7 An Inner Product on $M_{22}$



If

$U = \begin{bmatrix} u_1 & u_2 \\ u_3 & u_4 \end{bmatrix}$ and $V = \begin{bmatrix} v_1 & v_2 \\ v_3 & v_4 \end{bmatrix}$

are any two 2 x 2 matrices, then the following formula defines an inner product on $M_{22}$ (verify):

$\langle U, V \rangle = \mathrm{tr}(U^T V) = \mathrm{tr}(V^T U) = u_1 v_1 + u_2 v_2 + u_3 v_3 + u_4 v_4$

(Refer to Section 1.3 for the definition of the trace.) For example, if

$U = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $V = \begin{bmatrix} -1 & 0 \\ 3 & 2 \end{bmatrix}$

then

$\langle U, V \rangle = 1(-1) + 2(0) + 3(3) + 4(2) = 16$

The norm of a matrix U relative to this inner product is

$\|U\| = \langle U, U \rangle^{1/2} = \sqrt{u_1^2 + u_2^2 + u_3^2 + u_4^2}$

and the unit sphere in this space consists of all 2 x 2 matrices U whose entries satisfy the equation $\|U\| = 1$, which on squaring
yields

$u_1^2 + u_2^2 + u_3^2 + u_4^2 = 1$
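The two descriptions of the inner product on $M_{22}$, as a trace and as a sum of entrywise products, can be compared directly. The Python sketch below does this for the matrices U and V of Example 7; the helper names `transpose`, `mat_mul`, and `trace` are illustrative and written out by hand so that the block needs no external library.

```python
def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def mat_mul(A, B):
    """Matrix product A*B for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

U = [[1, 2], [3, 4]]
V = [[-1, 0], [3, 2]]

via_trace = trace(mat_mul(transpose(U), V))
via_entries = sum(u * v for ru, rv in zip(U, V) for u, v in zip(ru, rv))
norm_U_squared = trace(mat_mul(transpose(U), U))
```

Both computations give 16 for these matrices, and the squared norm of U is 1 + 4 + 9 + 16 = 30.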



EXAMPLE 8 An Inner Product on $P_2$



If

$p = a_0 + a_1 x + a_2 x^2$ and $q = b_0 + b_1 x + b_2 x^2$

are any two vectors in $P_2$, then the following formula defines an inner product on $P_2$ (verify):

$\langle p, q \rangle = a_0 b_0 + a_1 b_1 + a_2 b_2$

The norm of the polynomial p relative to this inner product is

$\|p\| = \langle p, p \rangle^{1/2} = \sqrt{a_0^2 + a_1^2 + a_2^2}$

and the unit sphere in this space consists of all polynomials p in $P_2$ whose coefficients satisfy the equation $\|p\| = 1$, which on
squaring yields

$a_0^2 + a_1^2 + a_2^2 = 1$



Calculus Required 



EXAMPLE 9 An Inner Product on C[a, b]



Let f = f(x) and g = g(x) be two functions in C[a, b] and define

$\langle f, g \rangle = \int_a^b f(x) g(x)\,dx$    (6)

This is well-defined since the functions in C[a, b] are continuous. We shall show that this formula defines an inner product on
C[a, b] by verifying the four inner product axioms for functions f = f(x), g = g(x), and s = s(x) in C[a, b]:

1. $\langle f, g \rangle = \int_a^b f(x) g(x)\,dx = \int_a^b g(x) f(x)\,dx = \langle g, f \rangle$

which proves that Axiom 1 holds.

2. $\langle f + g, s \rangle = \int_a^b (f(x) + g(x)) s(x)\,dx$
$= \int_a^b f(x) s(x)\,dx + \int_a^b g(x) s(x)\,dx$
$= \langle f, s \rangle + \langle g, s \rangle$

which proves that Axiom 2 holds.

3. $\langle kf, g \rangle = \int_a^b k f(x) g(x)\,dx = k \int_a^b f(x) g(x)\,dx = k \langle f, g \rangle$

which proves that Axiom 3 holds.

4. If f = f(x) is any function in C[a, b], then $f^2(x) \ge 0$ for all x in [a, b]; therefore,

$\langle f, f \rangle = \int_a^b f^2(x)\,dx \ge 0$

Further, because $f^2(x) \ge 0$ and f = f(x) is continuous on [a, b], it follows that $\int_a^b f^2(x)\,dx = 0$ if and
only if f(x) = 0 for all x in [a, b]. Therefore, we have $\langle f, f \rangle = \int_a^b f^2(x)\,dx = 0$ if and only if f = 0. This
proves that Axiom 4 holds.
Calculus Required 



EXAMPLE 10 Norm of a Vector in C[a, b]



If C[a, b] has the inner product defined in the preceding example, then the norm of a function f = f(x) relative to this inner
product is

$\|f\| = \langle f, f \rangle^{1/2} = \sqrt{\int_a^b f^2(x)\,dx}$    (7)

and the unit sphere in this space consists of all functions f in C[a, b] that satisfy the equation $\|f\| = 1$, which on squaring yields

$\int_a^b f^2(x)\,dx = 1$



Calculus Required 



Remark Since polynomials are continuous functions on $(-\infty, \infty)$, they are continuous on any closed interval [a, b]. Thus,
for all such intervals the vector space $P_n$ is a subspace of C[a, b], and Formula 6 defines an inner product on $P_n$.



Calculus Required 



Remark Recall from calculus that the arc length of a curve y = f(x) over an interval [a, b] is given by the formula

$L = \int_a^b \sqrt{1 + [f'(x)]^2}\,dx$    (8)

Do not confuse this concept of arc length with $\|f\|$, which is the length (norm) of f when f is viewed as a vector in C[a, b].
Formulas 7 and 8 are quite different.

The following theorem lists some basic algebraic properties of inner products. 
THEOREM 6.1.1 



Properties of Inner Products 

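If u, v, and w are vectors in a real inner product space, and k is any scalar, then

(a) $\langle 0, v \rangle = \langle v, 0 \rangle = 0$

(b) $\langle u, v + w \rangle = \langle u, v \rangle + \langle u, w \rangle$

(c) $\langle u, kv \rangle = k \langle u, v \rangle$

(d) $\langle u - v, w \rangle = \langle u, w \rangle - \langle v, w \rangle$

(e) $\langle u, v - w \rangle = \langle u, v \rangle - \langle u, w \rangle$



Proof We shall prove part (b) and leave the proofs of the remaining parts as exercises.

$\langle u, v + w \rangle = \langle v + w, u \rangle$ [By symmetry]
$= \langle v, u \rangle + \langle w, u \rangle$ [By additivity]
$= \langle u, v \rangle + \langle u, w \rangle$ [By symmetry]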



The following example illustrates how Theorem 6.1.1 and the defining properties of inner products can be used to perform 
algebraic computations with inner products. As you read through the example, you will find it instructive to justify the steps. 



EXAMPLE 11 Calculating with Inner Products 



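$\langle u - 2v, 3u + 4v \rangle = \langle u, 3u + 4v \rangle - \langle 2v, 3u + 4v \rangle$
$= \langle u, 3u \rangle + \langle u, 4v \rangle - \langle 2v, 3u \rangle - \langle 2v, 4v \rangle$
$= 3\langle u, u \rangle + 4\langle u, v \rangle - 6\langle v, u \rangle - 8\langle v, v \rangle$
$= 3\|u\|^2 + 4\langle u, v \rangle - 6\langle u, v \rangle - 8\|v\|^2$
$= 3\|u\|^2 - 2\langle u, v \rangle - 8\|v\|^2$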
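The expansion in Example 11 holds in every real inner product space, so it can be spot-checked with any particular inner product. The Python sketch below uses the weighted inner product $3u_1 v_1 + 2u_2 v_2$ of Example 2 and hypothetical integer vectors, so that both sides can be compared exactly.

```python
def ip(u, v):
    """Weighted Euclidean inner product 3*u1*v1 + 2*u2*v2 on R^2."""
    return 3 * u[0] * v[0] + 2 * u[1] * v[1]

def comb(a, u, b, v):
    """Linear combination a*u + b*v of two vectors in R^2."""
    return (a * u[0] + b * v[0], a * u[1] + b * v[1])

# Hypothetical sample vectors with integer entries.
u, v = (1, -2), (4, 3)

# Left side: <u - 2v, 3u + 4v> computed directly.
lhs = ip(comb(1, u, -2, v), comb(3, u, 4, v))

# Right side: 3||u||^2 - 2<u, v> - 8||v||^2, using ||x||^2 = <x, x>.
rhs = 3 * ip(u, u) - 2 * ip(u, v) - 8 * ip(v, v)
```

For these vectors both sides equal -495.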

Since Theorem 6.1.1 is a general result, it is guaranteed to hold for all real inner product spaces. This is the real power of the 
axiomatic development of vector spaces and inner products — a single theorem proves a multitude of results at once. For example, 
we are guaranteed without any further proof that the five properties given in Theorem 6.1.1 are true for the inner product on $R^n$
generated by any matrix A [Formula 3]. For example, let us check part (b) of Theorem 6.1.1 for this inner product:

$\langle u, v + w \rangle = (v + w)^T A^T A u$
$= (v^T + w^T) A^T A u$ [Property of transpose]
$= (v^T A^T A u) + (w^T A^T A u)$ [Property of matrix multiplication]
$= \langle u, v \rangle + \langle u, w \rangle$
The reader will find it instructive to check the remaining parts of Theorem 6.1.1 for this inner product. 



Exercise Set 6.1 






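1. Let $\langle u, v \rangle$ be the Euclidean inner product on $R^2$, and let u = (3, -2), v = (4, 5), w = (-1, 6), and k = -4. Verify that

(a) $\langle u, v \rangle = \langle v, u \rangle$

(b) $\langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle$

(c) $\langle u, v + w \rangle = \langle u, v \rangle + \langle u, w \rangle$

(d) $\langle ku, v \rangle = k\langle u, v \rangle = \langle u, kv \rangle$

(e) $\langle 0, v \rangle = \langle v, 0 \rangle = 0$



2. Repeat Exercise 1 for the weighted Euclidean inner product $\langle u, v \rangle = 4u_1 v_1 + 5u_2 v_2$.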



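3. Compute $\langle u, v \rangle$ using the inner product in Example 7.

(a) $u = \begin{bmatrix} 3 & -2 \\ 4 & 8 \end{bmatrix}$, $v = \begin{bmatrix} -1 & 3 \\ 1 & 1 \end{bmatrix}$

(b) $u = \begin{bmatrix} 1 & 2 \\ -3 & 5 \end{bmatrix}$, $v = \begin{bmatrix} 4 & 6 \\ 0 & 8 \end{bmatrix}$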



4. Compute $\langle p, q \rangle$ using the inner product in Example 8.

(a) $p = -2 + x + 3x^2$, $q = 4 - 7x^2$

(b) $p = -5 + 2x + x^2$, $q = 3 + 2x - 4x^2$



5. (a) Use Formula 3 to show that $\langle u, v \rangle = 9u_1 v_1 + 4u_2 v_2$ is the inner product on $R^2$ generated by

$A = \begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix}$

(b) Use the inner product in part (a) to compute $\langle u, v \rangle$ if u = (-3, 2) and v = (1, 7).



6. (a) Use Formula 3 to show that $\langle u, v \rangle = 5u_1 v_1 - u_1 v_2 - u_2 v_1 + 10u_2 v_2$ is the inner product on $R^2$ generated by

$A = \begin{bmatrix} 2 & 1 \\ -1 & 3 \end{bmatrix}$

(b) Use the inner product in part (a) to compute $\langle u, v \rangle$ if u = (0, -3) and v = (6, 2).



7. Let $u = (u_1, u_2)$ and $v = (v_1, v_2)$. In each part, the given expression is an inner product on $R^2$. Find a matrix that generates it.

(a) $\langle u, v \rangle = 2u_1 v_1 + 5u_2 v_2$

(b) $\langle u, v \rangle = 4u_1 v_1 + 6u_2 v_2$



8. Let $u = (u_1, u_2)$ and $v = (v_1, v_2)$. Show that the following are inner products on $R^2$ by verifying that the inner product axioms
hold.

(a) $\langle u, v \rangle = 3u_1 v_1 + 5u_2 v_2$

(b) $\langle u, v \rangle = 4u_1 v_1 + u_2 v_1 + u_1 v_2 + 4u_2 v_2$



9. Let $u = (u_1, u_2, u_3)$ and $v = (v_1, v_2, v_3)$. Determine which of the following are inner products on $R^3$. For those that are not, list
the axioms that do not hold.

(a) $\langle u, v \rangle = u_1 v_1 + u_3 v_3$

(b) $\langle u, v \rangle = u_1^2 v_1^2 + u_2^2 v_2^2 + u_3^2 v_3^2$

(c) $\langle u, v \rangle = 2u_1 v_1 + u_2 v_2 + 4u_3 v_3$

(d) $\langle u, v \rangle = u_1 v_1 - u_2 v_2 + u_3 v_3$



10. In each part, use the given inner product on $R^2$ to find $\|w\|$, where w = (-1, 3).

(a) the Euclidean inner product

(b) the weighted Euclidean inner product $\langle u, v \rangle = 3u_1 v_1 + 2u_2 v_2$, where $u = (u_1, u_2)$ and $v = (v_1, v_2)$

(c) the inner product generated by the matrix



,4 = 



1 2 
1 3 



11. Use the inner products in Exercise 10 to find $d(u, v)$ for u = (-1, 2) and v = (2, 5).



12. Let $P_2$ have the inner product in Example 8. In each part, find $\|p\|$.

(a) $p = -2 + 3x + 2x^2$

(b) $p = 4 - 3x^2$



13. 



Let $M_{22}$ have the inner product in Example 7. In each part, find $\|A\|$.



& A = 



2 5 

3 6 



(b) 



A = 








14. 



Let $P_2$ have the inner product in Example 8. Find $d(p, q)$.



15. 



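Let $M_{22}$ have the inner product in Example 7. Find $d(A, B)$.

(a) $A = \begin{bmatrix} 2 & 6 \\ 9 & 4 \end{bmatrix}$, $B = \begin{bmatrix} -4 & 7 \\ 1 & 6 \end{bmatrix}$

(b) $A = \begin{bmatrix} -2 & 4 \\ 1 & 0 \end{bmatrix}$, $B = \begin{bmatrix} -5 & 1 \\ 6 & 2 \end{bmatrix}$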



16. 



Suppose that u, v, and w are vectors such that

$\langle u, v \rangle = 2$, $\langle v, w \rangle = -3$, $\langle u, w \rangle = 5$, $\|u\| = 1$, $\|v\| = 2$, $\|w\| = 7$

Evaluate the given expression.

(a) $\langle u + v, v + w \rangle$

(b) $\langle 2v - w, 3u + 2w \rangle$

(c) $\langle u - v - 2w, 4u + v \rangle$

(d) $\|u + v\|$

(e) $\|2w - v\|$

(f) $\|u - 2v + 4w\|$



17. (For Readers Who Have Studied Calculus)

Let the vector space $P_2$ have the inner product

$\langle p, q \rangle = \int_{-1}^{1} p(x) q(x)\,dx$

(a) Find $\|p\|$ for p = 1, p = x, and $p = x^2$.

(b) Find $d(p, q)$ if p = 1 and q = x.



18. 



Sketch the unit circle in $R^2$ using the given inner product.



(u, v) = -uivi I ^-"2^2 



(b) $\langle u, v \rangle = 2u_1 v_1 + u_2 v_2$



19. 



Find a weighted Euclidean inner product on $R^2$ for which the unit circle is the ellipse shown in the accompanying figure.




Figure Ex-19 



20. 



Show that the following identity holds for vectors in any inner product space. 

$\|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2$



21. 



Show that the following identity holds for vectors in any inner product space. 

$\langle u, v \rangle = \frac{1}{4}\|u + v\|^2 - \frac{1}{4}\|u - v\|^2$



22. 



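Let $U = \begin{bmatrix} u_1 & u_2 \\ u_3 & u_4 \end{bmatrix}$ and $V = \begin{bmatrix} v_1 & v_2 \\ v_3 & v_4 \end{bmatrix}$. Show that $\langle U, V \rangle = u_1 v_1 + u_2 v_3 + u_3 v_2 + u_4 v_4$ is not an inner product on $M_{22}$.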



23. 



Let p = p(x) and q = q(x) be polynomials in $P_2$. Show that

$\langle p, q \rangle = p(0)q(0) + p(\tfrac{1}{2})q(\tfrac{1}{2}) + p(1)q(1)$

is an inner product on $P_2$. Is this an inner product on $P_3$? Explain.



24. 



Prove: If $\langle u, v \rangle$ is the Euclidean inner product on $R^n$, and if A is an n x n matrix, then

$\langle u, Av \rangle = \langle A^T u, v \rangle$



Hint Use the fact that $\langle u, v \rangle = u \cdot v = v^T u$.



25. 



Verify the result in Exercise 24 for the Euclidean inner product on $R^3$ and



u = 



~-l~ 




2 




2 


v = 





, A = 


4 




-2 





1 




2 


r 


3 


4 







5 




1 


2 



26. 



Let $u = (u_1, u_2, \ldots, u_n)$ and $v = (v_1, v_2, \ldots, v_n)$. Show that

$\langle u, v \rangle = w_1 u_1 v_1 + w_2 u_2 v_2 + \cdots + w_n u_n v_n$

is an inner product on $R^n$ if $w_1, w_2, \ldots, w_n$ are positive real numbers.



27. (For Readers Who Have Studied Calculus) 

Use the inner product 

(p. q}= / p(x)q(x)dx 

to compute $\langle p, q \rangle$ for the vectors p = p(x) and q = q(x) in $P_3$.

(a) $p = 1 - x + x^2 + 5x^3$, $q = x - 3x^2$

(b) $p = x - 5x^3$, $q = 2 + 8x^2$

28. (For Readers Who Have Studied Calculus) 

In each part, use the inner product

$\langle f, g \rangle = \int_0^1 f(x) g(x)\,dx$

to compute $\langle f, g \rangle$ for the vectors f = f(x) and g = g(x) in C[0, 1].

(a) $f = \cos 2\pi x$, $g = \sin 2\pi x$

(b) $f = x$, $g = e^x$



(c) f = tannic, g=l 



29. 



Show that the inner product in Example 7 can be written as $\langle U, V \rangle = \mathrm{tr}(U^T V)$.



30. 



Prove that Formula 3 defines an inner product on R n . 
Hint Use the alternative version of Formula 3 given by 4. 



31. 



Show that matrix 5 generates the weighted Euclidean inner product $\langle u, v \rangle = w_1 u_1 v_1 + w_2 u_2 v_2 + \cdots + w_n u_n v_n$.



Discussion 
Discovery 



32. The following is a proof of part (c) of Theorem 6.1.1. Fill in each blank line with the name of an
inner product axiom that justifies the step.

Hypothesis: Let u and v be vectors in a real inner product space.

Conclusion: $\langle u, kv \rangle = k\langle u, v \rangle$.

Proof:

1. $\langle u, kv \rangle = \langle kv, u \rangle$ ______

2. $= k\langle v, u \rangle$ ______

3. $= k\langle u, v \rangle$ ______



33. Prove parts (a), (d), and (e) of Theorem 6.1.1, justifying each step with the name of a vector space
axiom or by referring to previously established results.

34. Create a weighted Euclidean inner product $\langle u, v \rangle = a u_1 v_1 + b u_2 v_2$ on $R^2$ for which the unit circle
in an xy-coordinate system is the ellipse shown in the accompanying figure.




Figure Ex-34 



35. Generalize the result of Problem 34 for an ellipse with semimajor axis a and semiminor axis b, with
a and b positive.



Copyright © 2005 John Wiley & Sons, Inc. All rights reserved. 



6.2 

ANGLE AND 
ORTHOGONALITY IN 
INNER PRODUCT 
SPACES 



In this section we shall define the notion of an angle between two vectors in 
an inner product space, and we shall use this concept to obtain some basic 
relations between vectors in an inner product, including a fundamental 
geometric relationship between the nullspace and column space of a matrix. 



Cauchy-Schwarz Inequality 

Recall from Formula 1 of Section 3.3 that if u and v are nonzero vectors in $R^2$ or $R^3$ and $\theta$ is the angle between them, then

$u \cdot v = \|u\| \|v\| \cos\theta$    (1)

or, alternatively,

$\cos\theta = \frac{u \cdot v}{\|u\| \|v\|}$    (2)



Our first goal in this section is to define the concept of an angle between two vectors in a general inner product space. For such
a definition to be reasonable, we would want it to be consistent with Formula 2 when it is applied to the special case of $R^2$ and
$R^3$ with the Euclidean inner product. Thus we will want our definition of the angle $\theta$ between two nonzero vectors in an inner
product space to satisfy the relationship

$\cos\theta = \frac{\langle u, v \rangle}{\|u\| \|v\|}$    (3)

However, because $|\cos\theta| \le 1$, there would be no hope of satisfying (3) unless we were assured that every pair of nonzero vectors
in an inner product space satisfies the inequality

$\left| \frac{\langle u, v \rangle}{\|u\| \|v\|} \right| \le 1$

Fortunately, we will be able to prove that this is the case by using the following generalization of the Cauchy-Schwarz
inequality (see Theorem 4.1.3).



THEOREM 6.2.1 



Cauchy-Schwarz Inequality 

If u and v are vectors in a real inner product space, then

$|\langle u, v \rangle| \le \|u\| \|v\|$    (4)



Proof We warn the reader in advance that the proof presented here depends on a clever trick that is not easy to motivate. If
u = 0, then $\langle u, v \rangle = \langle u, u \rangle = 0$, so the two sides of (4) are equal. Assume now that $u \ne 0$. Let $a = \langle u, u \rangle$, $b = 2\langle u, v \rangle$, and
$c = \langle v, v \rangle$, and let t be any real number. By the positivity axiom, the inner product of any vector with itself is always
nonnegative. Therefore,

$0 \le \langle tu + v, tu + v \rangle = \langle u, u \rangle t^2 + 2\langle u, v \rangle t + \langle v, v \rangle = at^2 + bt + c$

This inequality implies that the quadratic polynomial $at^2 + bt + c$ has either no real roots or a repeated real root. Therefore, its
discriminant must satisfy the inequality $b^2 - 4ac \le 0$. Expressing the coefficients a, b, and c in terms of the vectors u and v
gives $4\langle u, v \rangle^2 - 4\langle u, u \rangle \langle v, v \rangle \le 0$, or, equivalently,

$\langle u, v \rangle^2 \le \langle u, u \rangle \langle v, v \rangle$

Taking square roots of both sides and using the fact that $\langle u, u \rangle$ and $\langle v, v \rangle$ are nonnegative yields

$|\langle u, v \rangle| \le \langle u, u \rangle^{1/2} \langle v, v \rangle^{1/2}$ or, equivalently, $|\langle u, v \rangle| \le \|u\| \|v\|$

which completes the proof.



For reference, we note that the Cauchy-Schwarz inequality can be written in the following two alternative forms:

$\langle u, v \rangle^2 \le \langle u, u \rangle \langle v, v \rangle$    (5)

$\langle u, v \rangle^2 \le \|u\|^2 \|v\|^2$    (6)

The first of these formulas was obtained in the proof of Theorem 6.2.1, and the second is derived from the first using the fact
that $\|u\|^2 = \langle u, u \rangle$ and $\|v\|^2 = \langle v, v \rangle$.
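The inequality in form (5) can be explored experimentally. The following Python sketch tests it on randomly chosen vectors for a weighted Euclidean inner product on $R^3$; the weights, the random seed, the sample size, and the rounding tolerance are all assumptions made here for illustration. It also checks one case of equality, where v is a scalar multiple of u.

```python
import random

def weighted_ip(u, v, w):
    """Weighted Euclidean inner product with positive weights w."""
    return sum(wi * ui * vi for wi, ui, vi in zip(w, u, v))

random.seed(0)            # fixed seed so that the run is reproducible
w = [2.0, 0.5, 3.0]       # hypothetical positive weights

holds = True
for _ in range(1000):
    u = [random.uniform(-10, 10) for _ in w]
    v = [random.uniform(-10, 10) for _ in w]
    lhs = weighted_ip(u, v, w) ** 2
    rhs = weighted_ip(u, u, w) * weighted_ip(v, v, w)
    # Allow a tiny relative slack for floating-point rounding.
    holds = holds and lhs <= rhs * (1 + 1e-12) + 1e-12

# Equality case: v = 2u.
u0 = [1.0, 2.0, 3.0]
v0 = [2.0, 4.0, 6.0]
eq_lhs = weighted_ip(u0, v0, w) ** 2
eq_rhs = weighted_ip(u0, u0, w) * weighted_ip(v0, v0, w)
```

An experiment of this kind cannot prove the theorem, but a single violation would have indicated an error in the inner product being used.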



EXAMPLE 1 Cauchy-Schwarz Inequality in R n 



The Cauchy-Schwarz inequality for $R^n$ (Theorem 4.1.3) follows as a special case of Theorem 6.2.1 by taking $\langle u, v \rangle$ to be the
Euclidean inner product $u \cdot v$.

The next two theorems show that the basic properties of length and distance that were established in Theorems 4.1.4 and 4.1.5 
for vectors in Euclidean n-space continue to hold in general inner product spaces. This is strong evidence that our definitions of
inner product, length, and distance are well chosen. 

THEOREM 6.2.2 



Properties of Length 

If u and v are vectors in an inner product space V, and if k is any scalar, then

(a) $\|u\| \ge 0$

(b) $\|u\| = 0$ if and only if $u = 0$

(c) $\|ku\| = |k| \|u\|$

(d) $\|u + v\| \le \|u\| + \|v\|$ (Triangle inequality)











THEOREM 6.2.3 



Properties of Distance 

If u, v, and w are vectors in an inner product space V, and if k is any scalar, then

(a) $d(u, v) \ge 0$

(b) $d(u, v) = 0$ if and only if $u = v$

(c) $d(u, v) = d(v, u)$

(d) $d(u, v) \le d(u, w) + d(w, v)$ (Triangle inequality)



We shall prove part (d) of Theorem 6.2.2 and leave the remaining parts of Theorems 6.2.2 and 6.2.3 as
exercises.



Proof of Theorem 6.2.2d By definition,

$\|u + v\|^2 = \langle u + v, u + v \rangle$
$= \langle u, u \rangle + 2\langle u, v \rangle + \langle v, v \rangle$
$\le \langle u, u \rangle + 2|\langle u, v \rangle| + \langle v, v \rangle$ [Property of absolute value]
$\le \langle u, u \rangle + 2\|u\| \|v\| + \langle v, v \rangle$ [By (4)]
$= \|u\|^2 + 2\|u\| \|v\| + \|v\|^2$
$= (\|u\| + \|v\|)^2$

Taking square roots gives $\|u + v\| \le \|u\| + \|v\|$.



Angle Between Vectors 

We shall now show how the Cauchy-Schwarz inequality can be used to define angles in general inner product spaces. Suppose
that u and v are nonzero vectors in an inner product space V. If we divide both sides of Formula 6 by $\|u\|^2 \|v\|^2$, we obtain

$\left[ \frac{\langle u, v \rangle}{\|u\| \|v\|} \right]^2 \le 1$

or, equivalently,

$-1 \le \frac{\langle u, v \rangle}{\|u\| \|v\|} \le 1$    (7)

Now if $\theta$ is an angle whose radian measure varies from 0 to $\pi$, then $\cos\theta$ assumes every value between -1 and 1 inclusive
exactly once (Figure 6.2.1).

[Figure 6.2.1]

Thus, from (7), there is a unique angle $\theta$ such that

$\cos\theta = \frac{\langle u, v \rangle}{\|u\| \|v\|}$ and $0 \le \theta \le \pi$    (8)

We define $\theta$ to be the angle between u and v. Observe that in $R^2$ or $R^3$ with the Euclidean inner product, (8) agrees with the
usual formula for the cosine of the angle between two nonzero vectors [Formula 2].



EXAMPLE 2 Cosine of an Angle Between Two Vectors in $R^4$



Let $R^4$ have the Euclidean inner product. Find the cosine of the angle $\theta$ between the vectors u = (4, 3, 1, -2) and
v = (-2, 1, 2, 3).

Solution

We leave it for the reader to verify that

$\|u\| = \sqrt{30}$, $\|v\| = \sqrt{18}$, and $\langle u, v \rangle = -9$

so that

$\cos\theta = \frac{\langle u, v \rangle}{\|u\| \|v\|} = -\frac{9}{\sqrt{30}\sqrt{18}} = -\frac{3}{2\sqrt{15}}$
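The computation left to the reader in Example 2 can be carried out as follows. This Python sketch evaluates Formula 8 for the Euclidean inner product on $R^4$; the function names `dot` and `cos_angle` are illustrative choices made here.

```python
import math

def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

def cos_angle(u, v):
    """Cosine of the angle between nonzero vectors u and v (Formula 8)."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

u = (4, 3, 1, -2)
v = (-2, 1, 2, 3)

norm_u_squared = dot(u, u)   # 30
norm_v_squared = dot(v, v)   # 18
inner = dot(u, v)            # -9
c = cos_angle(u, v)          # approximately -3 / (2 * sqrt(15))
theta = math.acos(c)         # the angle itself, between 0 and pi
```

Since the cosine is negative, the angle returned by `math.acos` lies between $\pi/2$ and $\pi$.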



Orthogonality 

Example 2 is primarily a mathematical exercise, for there is relatively little need to find angles between vectors, except in $R^2$
and $R^3$ with the Euclidean inner product. However, a problem of major importance in all inner product spaces is to determine
whether two vectors are orthogonal, that is, whether the angle between them is $\theta = \pi/2$.

It follows from (8) that if u and v are nonzero vectors in an inner product space and $\theta$ is the angle between them, then $\cos\theta = 0$ if
and only if $\langle u, v \rangle = 0$. Equivalently, for nonzero vectors we have $\theta = \pi/2$ if and only if $\langle u, v \rangle = 0$. If we agree to consider the
angle between u and v to be $\pi/2$ when either or both of these vectors is 0, then we can state without exception that the angle
between u and v is $\pi/2$ if and only if $\langle u, v \rangle = 0$. This suggests the following definition.



DEFINITION 



Two vectors u and v in an inner product space are called orthogonal if $\langle u, v \rangle = 0$.



Observe that in the special case where $\langle u, v \rangle = u \cdot v$ is the Euclidean inner product on $R^n$, this definition reduces to the
definition of orthogonality in Euclidean n-space given in Section 4.1. We also emphasize that orthogonality depends on the
inner product; two vectors can be orthogonal with respect to one inner product but not another.



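EXAMPLE 3 Orthogonal Vectors in $M_{22}$



If $M_{22}$ has the inner product of Example 7 in the preceding section, then the matrices

$U = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$ and $V = \begin{bmatrix} 0 & 2 \\ 0 & 0 \end{bmatrix}$

are orthogonal, since

$\langle U, V \rangle = 1(0) + 0(2) + 1(0) + 1(0) = 0$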



Calculus Required 



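EXAMPLE 4 Orthogonal Vectors in $P_2$



Let $P_2$ have the inner product

$\langle p, q \rangle = \int_{-1}^{1} p(x) q(x)\,dx$

and let $p = x$ and $q = x^2$. Then

$\|p\| = \langle p, p \rangle^{1/2} = \left[ \int_{-1}^{1} x x\,dx \right]^{1/2} = \left[ \int_{-1}^{1} x^2\,dx \right]^{1/2} = \sqrt{\tfrac{2}{3}}$

$\|q\| = \langle q, q \rangle^{1/2} = \left[ \int_{-1}^{1} x^2 x^2\,dx \right]^{1/2} = \left[ \int_{-1}^{1} x^4\,dx \right]^{1/2} = \sqrt{\tfrac{2}{5}}$

$\langle p, q \rangle = \int_{-1}^{1} x x^2\,dx = \int_{-1}^{1} x^3\,dx = 0$

Because $\langle p, q \rangle = 0$, the vectors $p = x$ and $q = x^2$ are orthogonal relative to the given inner product.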

In Section 4.1 we proved the Theorem of Pythagoras for vectors in Euclidean n-space. The following theorem extends this
result to vectors in any inner product space. 



THEOREM 6.2.4 



Generalized Theorem of Pythagoras 

If u and v are orthogonal vectors in an inner product space, then

$\|u + v\|^2 = \|u\|^2 + \|v\|^2$



Proof The orthogonality of u and v implies that $\langle u, v \rangle = 0$, so

$\|u + v\|^2 = \langle u + v, u + v \rangle = \|u\|^2 + 2\langle u, v \rangle + \|v\|^2$
$= \|u\|^2 + \|v\|^2$


Calculus Required 



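EXAMPLE 5 Theorem of Pythagoras in $P_2$



In Example 4 we showed that $p = x$ and $q = x^2$ are orthogonal relative to the inner product

$\langle p, q \rangle = \int_{-1}^{1} p(x) q(x)\,dx$

on $P_2$. It follows from the Theorem of Pythagoras that

$\|p + q\|^2 = \|p\|^2 + \|q\|^2$

Thus, from the computations in Example 4, we have

$\|p + q\|^2 = \left( \sqrt{\tfrac{2}{3}} \right)^2 + \left( \sqrt{\tfrac{2}{5}} \right)^2 = \tfrac{2}{3} + \tfrac{2}{5} = \tfrac{16}{15}$

We can check this result by direct integration:

$\|p + q\|^2 = \langle p + q, p + q \rangle = \int_{-1}^{1} (x + x^2)(x + x^2)\,dx$
$= \int_{-1}^{1} x^2\,dx + 2\int_{-1}^{1} x^3\,dx + \int_{-1}^{1} x^4\,dx = \tfrac{2}{3} + 0 + \tfrac{2}{5} = \tfrac{16}{15}$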
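For polynomials, the integral inner product on [-1, 1] can be evaluated exactly, which allows the computations of Examples 4 and 5 to be checked without rounding. The Python sketch below represents a polynomial by its list of coefficients in increasing powers of x and uses exact rational arithmetic; the representation and the helper names are choices made here, not part of the text.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists [a0, a1, ...]."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def integrate(p, lo, hi):
    """Exact integral of the polynomial p over [lo, hi], term by term."""
    lo, hi = Fraction(lo), Fraction(hi)
    return sum(Fraction(a) * (hi ** (k + 1) - lo ** (k + 1)) / (k + 1)
               for k, a in enumerate(p))

def ip(p, q):
    """Inner product <p, q> = integral of p(x) q(x) over [-1, 1]."""
    return integrate(poly_mul(p, q), -1, 1)

p = [0, 1, 0]                       # p = x
q = [0, 0, 1]                       # q = x^2
s = [a + b for a, b in zip(p, q)]   # p + q = x + x^2

inner_pq = ip(p, q)        # 0, so p and q are orthogonal
norm_p_sq = ip(p, p)       # 2/3
norm_q_sq = ip(q, q)       # 2/5
norm_sum_sq = ip(s, s)     # 16/15
```

The values confirm that $\|p + q\|^2 = \|p\|^2 + \|q\|^2$ for this orthogonal pair.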

Orthogonal Complements 

If V is a plane through the origin of $R^3$ with the Euclidean inner product, then the set of all vectors that are orthogonal to every
vector in V forms the line L through the origin that is perpendicular to V (Figure 6.2.2). In the language of linear algebra we say 
that the line and the plane are orthogonal complements of one another. The following definition extends this concept to general 
inner product spaces. 



[Figure 6.2.2: Every vector in L is orthogonal to every vector in V.]



DEFINITION 



Let W be a subspace of an inner product space V. A vector u in V is said to be orthogonal to W if it is orthogonal to every
vector in W, and the set of all vectors in V that are orthogonal to W is called the orthogonal complement of W.



Recall from geometry that the symbol $\perp$ is used to indicate perpendicularity. In linear algebra the orthogonal complement of a
subspace W is denoted by $W^\perp$ (read "W perp"). The following theorem lists the basic properties of orthogonal complements.



THEOREM 6.2.5 



Properties of Orthogonal Complements 

If W is a subspace of a finite-dimensional inner product space V, then

(a) $W^\perp$ is a subspace of V.

(b) The only vector common to W and $W^\perp$ is 0.

(c) The orthogonal complement of $W^\perp$ is W; that is, $(W^\perp)^\perp = W$.



We shall prove parts (a) and (b). The proof of (c) requires results covered later in this chapter, so its proof is left for the 
exercises at the end of the chapter. 



Proof (a) Note first that $\langle 0, w \rangle = 0$ for every vector w in W, so $W^\perp$ contains at least the zero vector. We want to show that
$W^\perp$ is closed under addition and scalar multiplication; that is, we want to show that the sum of two vectors in $W^\perp$ is
orthogonal to every vector in W and that any scalar multiple of a vector in $W^\perp$ is orthogonal to every vector in W. Let u and v
be any vectors in $W^\perp$, let k be any scalar, and let w be any vector in W. Then, from the definition of $W^\perp$, we have $\langle u, w \rangle = 0$
and $\langle v, w \rangle = 0$. Using basic properties of the inner product, we have

$\langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle = 0 + 0 = 0$
$\langle ku, w \rangle = k\langle u, w \rangle = k(0) = 0$

which proves that u + v and ku are in $W^\perp$.



Proof (b) If v is common to W and $W^\perp$, then $\langle v, v \rangle = 0$, which implies that v = 0 by Axiom 4 for inner products.



Remark Because W and $W^\perp$ are orthogonal complements of one another by part (c) of the preceding theorem, we shall say
that W and $W^\perp$ are orthogonal complements.

A Geometric Link between Nullspace and Row Space 

The following fundamental theorem provides a geometric link between the nullspace and row space of a matrix. 
THEOREM 6.2.6 



If A is an m x n matrix, then

(a) The nullspace of A and the row space of A are orthogonal complements in $R^n$ with respect to the Euclidean inner
product.

(b) The nullspace of $A^T$ and the column space of A are orthogonal complements in $R^m$ with respect to the Euclidean
inner product.



Proof (a) We want to show that the orthogonal complement of the row space of A is the nullspace of A. To do this, we must 
show that if a vector v is orthogonal to every vector in the row space, then ^y — 0, and conversely, that if Av = 0. then v is 
orthogonal to every vector in the row space. 

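Assume first that v is orthogonal to every vector in the row space of A. Then in particular, v is orthogonal to the row vectors $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_m$ of A; that is,

$$\mathbf{r}_1 \cdot \mathbf{v} = \mathbf{r}_2 \cdot \mathbf{v} = \cdots = \mathbf{r}_m \cdot \mathbf{v} = 0 \tag{9}$$

But by Formula 11 of Section 4.1, the linear system Ax = 0 can be expressed in dot product notation as

$$\begin{bmatrix} \mathbf{r}_1 \cdot \mathbf{x} \\ \mathbf{r}_2 \cdot \mathbf{x} \\ \vdots \\ \mathbf{r}_m \cdot \mathbf{x} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \tag{10}$$

so it follows from (9) that v is a solution of this system and hence lies in the nullspace of A.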

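Conversely, assume that v is a vector in the nullspace of A, so Av = 0. It follows from (10) that

$$\mathbf{r}_1 \cdot \mathbf{v} = \mathbf{r}_2 \cdot \mathbf{v} = \cdots = \mathbf{r}_m \cdot \mathbf{v} = 0$$

But if r is any vector in the row space of A, then r is expressible as a linear combination of the row vectors of A, say

$$\mathbf{r} = c_1\mathbf{r}_1 + c_2\mathbf{r}_2 + \cdots + c_m\mathbf{r}_m$$

Thus

$$\mathbf{r} \cdot \mathbf{v} = (c_1\mathbf{r}_1 + c_2\mathbf{r}_2 + \cdots + c_m\mathbf{r}_m) \cdot \mathbf{v}$$
$$= c_1(\mathbf{r}_1 \cdot \mathbf{v}) + c_2(\mathbf{r}_2 \cdot \mathbf{v}) + \cdots + c_m(\mathbf{r}_m \cdot \mathbf{v})$$
$$= 0 + 0 + \cdots + 0 = 0$$

which proves that v is orthogonal to every vector in the row space of A.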



Proof (b) Since the column space of A is the row space of $A^T$ (except for a difference in notation), the proof follows by applying the result in part (a) to $A^T$.



The following example shows how Theorem 6.2.6 can be used to find a basis for the orthogonal complement of a subspace of 
Euclidean n-space. 



EXAMPLE 6 Basis for an Orthogonal Complement 



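Let W be the subspace of $R^5$ spanned by the vectors

$$\mathbf{w}_1 = (2, 2, -1, 0, 1), \quad \mathbf{w}_2 = (-1, -1, 2, -3, 1),$$
$$\mathbf{w}_3 = (1, 1, -2, 0, -1), \quad \mathbf{w}_4 = (0, 0, 1, 1, 1)$$

Find a basis for the orthogonal complement of W.

Solution

The space W spanned by $\mathbf{w}_1$, $\mathbf{w}_2$, $\mathbf{w}_3$, and $\mathbf{w}_4$ is the same as the row space of the matrix

$$A = \begin{bmatrix} 2 & 2 & -1 & 0 & 1 \\ -1 & -1 & 2 & -3 & 1 \\ 1 & 1 & -2 & 0 & -1 \\ 0 & 0 & 1 & 1 & 1 \end{bmatrix}$$

and by part (a) of Theorem 6.2.6, the nullspace of A is the orthogonal complement of W. In Example 4 of Section 5.5 we showed that

$$\mathbf{v}_1 = \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} \quad \text{and} \quad \mathbf{v}_2 = \begin{bmatrix} -1 \\ 0 \\ -1 \\ 0 \\ 1 \end{bmatrix}$$

form a basis for this nullspace. Expressing these vectors in the same notation as $\mathbf{w}_1$, $\mathbf{w}_2$, $\mathbf{w}_3$, and $\mathbf{w}_4$, we conclude that the vectors

$$\mathbf{v}_1 = (-1, 1, 0, 0, 0) \quad \text{and} \quad \mathbf{v}_2 = (-1, 0, -1, 0, 1)$$

form a basis for the orthogonal complement of W. As a check, the reader may want to verify that $\mathbf{v}_1$ and $\mathbf{v}_2$ are orthogonal to $\mathbf{w}_1$, $\mathbf{w}_2$, $\mathbf{w}_3$, and $\mathbf{w}_4$ by calculating the necessary dot products.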
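The check suggested at the end of Example 6 can be carried out mechanically. The following is a minimal sketch in plain Python, assuming the spanning vectors and the nullspace basis read as in the example; the helper name `dot` is introduced here for illustration only.

```python
# Spanning vectors of W, as read from Example 6.
w = [
    (2, 2, -1, 0, 1),
    (-1, -1, 2, -3, 1),
    (1, 1, -2, 0, -1),
    (0, 0, 1, 1, 1),
]
# Claimed basis for the orthogonal complement of W.
v = [
    (-1, 1, 0, 0, 0),
    (-1, 0, -1, 0, 1),
]


def dot(a, b):
    """Euclidean inner product of two equal-length tuples (illustrative helper)."""
    return sum(x * y for x, y in zip(a, b))


# products[i][j] holds v_i . w_j; orthogonality means every entry is zero.
products = [[dot(vi, wj) for wj in w] for vi in v]
all_orthogonal = all(p == 0 for row in products for p in row)

print(products)
print(all_orthogonal)
```

Because every vector in W is a linear combination of the four spanning vectors, checking these eight dot products is enough to confirm orthogonality to all of W.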



Summary 

We leave it for the reader to show that in any inner product space V, the zero space {0} and the entire space V are orthogonal complements. Thus, if A is an n × n matrix, to say that Ax = 0 has only the trivial solution is equivalent to saying that the orthogonal complement of the nullspace of A is all of $R^n$, or, equivalently, that the row space of A is all of $R^n$. This enables us to add two new results to the seventeen listed in Theorem 5.6.9.

THEOREM 6.2.7 



Equivalent Statements 

If A is an n × n matrix, and if $T_A: R^n \to R^n$ is multiplication by A, then the following are equivalent.

(a) A is invertible.

(b) Ax = 0 has only the trivial solution.

(c) The reduced row-echelon form of A is $I_n$.

(d) A is expressible as a product of elementary matrices.

(e) Ax = b is consistent for every n × 1 matrix b.

(f) Ax = b has exactly one solution for every n × 1 matrix b.

(g) det(A) ≠ 0.

(h) The range of $T_A$ is $R^n$.

(i) $T_A$ is one-to-one.

(j) The column vectors of A are linearly independent.

(k) The row vectors of A are linearly independent.

(l) The column vectors of A span $R^n$.

(m) The row vectors of A span $R^n$.

(n) The column vectors of A form a basis for $R^n$.

(o) The row vectors of A form a basis for $R^n$.

(p) A has rank n.

(q) A has nullity 0.

(r) The orthogonal complement of the nullspace of A is $R^n$.

(s) The orthogonal complement of the row space of A is {0}.



This theorem relates all of the major topics we have studied thus far. 



Exercise Set 6.2 






1. In each part, determine whether the given vectors are orthogonal with respect to the Euclidean inner product.

(a) u = (-1, 3, 2), v = (4, 2, -1)

(b) u = (-2, -2, -2), v = (1, 1, 1)

(c) u = ($u_1$, $u_2$, $u_3$), v = (0, 0, 0)

(d) u = (-4, 6, -10, 1), v = (2, 1, -2, 9)

(e) u = (0, 3, -2, 1), v = (5, 2, -1, 0)

(f) u = (a, b), v = (-b, a)

2. Do there exist scalars k, l such that the vectors u = (2, k, 6), v = (l, 5, 3), and w = (1, 2, 3) are mutually orthogonal with respect to the Euclidean inner product?

3. Let $R^3$ have the Euclidean inner product. Let u = (1, 1, -1) and v = (6, 7, -15). If $\|k\mathbf{u} + \mathbf{v}\| = 13$, what is k?

4. Let $R^4$ have the Euclidean inner product, and let u = (-1, 1, 0, 2). Determine whether the vector u is orthogonal to the subspace spanned by the vectors $\mathbf{w}_1$ = (0, 0, 0, 0), $\mathbf{w}_2$ = (1, -1, 3, 0), and $\mathbf{w}_3$ = (4, 0, 9, 2).



5. Let $R^2$, $R^3$, and $R^4$ have the Euclidean inner product. In each part, find the cosine of the angle between u and v.

(a) u = (1, -3), v = (2, 4)

(b) u = (-1, 0), v = (3, 8)

(c) u = (-1, 5, 2), v = (2, 4, -9)

(d) u = (4, 1, 8), v = (1, 0, -3)

(e) u = (1, 0, 1, 0), v = (-3, -3, -3, -3)

(f) u = (2, 1, 7, -1), v = (4, 0, 0, 0)

6. Let $P_2$ have the inner product in Example 8 of Section 6.1. Find the cosine of the angle between p and q.

(a) $p = -1 + 5x + 2x^2$, $q = 2 + 4x - 9x^2$

(b) $p = x - x^2$, $q = 7 + 3x + 3x^2$

7. Show that $p = 1 - x + 2x^2$ and $q = 2x + x^2$ are orthogonal with respect to the inner product in Exercise 6.

8. Let $M_{22}$ have the inner product in Example 7 of Section 6.1. Find the cosine of the angle between A and B.



^ A = 



2 6 
1 -3 



B = 



3 2 
1 



(b) 



A = 



2 4 
-1 3 



,B = 



-3 1 

4 2 



Let 



^4 = 



2 1 
-1 3 



Which of the following matrices are orthogonal to A with respect to the inner product in Exercise 8? 



(a) 



-3 
2 



(b) 



1 1 
-1 



(c) 








(d) 



2 1 
5 2 



10. Let $R^3$ have the Euclidean inner product. For which values of k are u and v orthogonal?

(a) u = (2, 1, 3), v = (1, 7, k)

(b) u = (k, k, 1), v = (k, 5, 6)

11. Let $R^4$ have the Euclidean inner product. Find two unit vectors that are orthogonal to the three vectors u = (2, 1, -4, 0), v = (-1, -1, 2, 2), and w = (3, 2, 5, 4).



12. 



In each part, verify that the Cauchy-Schwarz inequality holds for the given vectors using the Euclidean inner product. 



(a) u = (3, 2), v = (4, -1)

(b) u = (-3, 1, 0), v = (2, -1, 3)

(c) u = (-4, 2, 1), v = (8, -4, -2)

(d) u = (0, -2, 2, 1), v = (-1, -1, 1, 1)



13. 



In each part, verify that the Cauchy-Schwarz inequality holds for the given vectors. 



(a) u = (-2, 1) and v = (1, 0) using the inner product of Example 2 of Section 6.1



(b) 



U = 



-1 2 
6 1 



and F = 



1 
3 3 



using the inner product in Example 7 of Section 6.1

(c) $p = -1 + 2x + x^2$ and $q = 2 - 4x^2$ using the inner product given in Example 8 of Section 6.1



14. Let W be the line in $R^2$ with equation y = 2x. Find an equation for $W^\perp$.

15.

(a) Let W be the plane in $R^3$ with equation x - 2y - 3z = 0. Find parametric equations for $W^\perp$.

(b) Let W be the line in $R^3$ with parametric equations

$$x = 2t, \quad y = -5t, \quad z = 4t \qquad (-\infty < t < \infty)$$

Find an equation for $W^\perp$.

(c) Let W be the intersection of the two planes

$$x + y + z = 0 \quad \text{and} \quad x - y + z = 0$$

in $R^3$. Find an equation for $W^\perp$.



Let 
16. 



,4 = 



12-12 
3 5 4 
112 



(a) Find bases for the row space and nullspace of A. 

(b) Verify that every vector in the row space is orthogonal to every vector in the nullspace (as guaranteed by Theorem 
6.2.6a). 



17. Let A be the matrix in Exercise 16.

(a) Find bases for the column space of A and nullspace of $A^T$.

(b) Verify that every vector in the column space of A is orthogonal to every vector in the nullspace of $A^T$ (as guaranteed by Theorem 6.2.6b).



18. Find a basis for the orthogonal complement of the subspace of $R^n$ spanned by the vectors.

(a) $\mathbf{v}_1$ = (1, -1, 3), $\mathbf{v}_2$ = (5, -4, -4), $\mathbf{v}_3$ = (7, -6, 2)

(b) $\mathbf{v}_1$ = (2, 0, -1), $\mathbf{v}_2$ = (4, 0, -2)

(c) $\mathbf{v}_1$ = (1, 4, 5, 2), $\mathbf{v}_2$ = (2, 1, 3, 0), $\mathbf{v}_3$ = (-1, 3, 2, 2)

(d) $\mathbf{v}_1$ = (1, 4, 5, 6, 9), $\mathbf{v}_2$ = (3, -2, 1, 4, -1), $\mathbf{v}_3$ = (-1, 0, -1, -2, -1), $\mathbf{v}_4$ = (2, 3, 5, 7, 8)

19. Let V be an inner product space. Show that if u and v are orthogonal unit vectors in V, then $\|\mathbf{u} - \mathbf{v}\| = \sqrt{2}$.

20. Let V be an inner product space. Show that if w is orthogonal to both $\mathbf{u}_1$ and $\mathbf{u}_2$, it is orthogonal to $k_1\mathbf{u}_1 + k_2\mathbf{u}_2$ for all scalars $k_1$ and $k_2$. Interpret this result geometrically in the case where V is $R^3$ with the Euclidean inner product.

21. Let V be an inner product space. Show that if w is orthogonal to each of the vectors $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r$, then it is orthogonal to every vector in span$\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r\}$.

22. Let $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ be a basis for an inner product space V. Show that the zero vector is the only vector in V that is orthogonal to all of the basis vectors.

23. Let $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_k\}$ be a basis for a subspace W of V. Show that $W^\perp$ consists of all vectors in V that are orthogonal to every basis vector.

24. Prove the following generalization of Theorem 6.2.4. If $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ are pairwise orthogonal vectors in an inner product space V, then

$$\|\mathbf{v}_1 + \mathbf{v}_2 + \cdots + \mathbf{v}_r\|^2 = \|\mathbf{v}_1\|^2 + \|\mathbf{v}_2\|^2 + \cdots + \|\mathbf{v}_r\|^2$$

25. Prove the following parts of Theorem 6.2.2:

(a) part (a)

(b) part (b)

(c) part (c)

26. Prove the following parts of Theorem 6.2.3:

(a) part (a)

(b) part (b)

(c) part (c)

(d) part (d)



27. Prove: If u and v are n × 1 matrices and A is an n × n matrix, then

$$(\mathbf{v}^T A^T A \mathbf{u})^2 \le (\mathbf{u}^T A^T A \mathbf{u})(\mathbf{v}^T A^T A \mathbf{v})$$

28. Use the Cauchy-Schwarz inequality to prove that for all real values of a, b, and θ,

$$(a \cos\theta + b \sin\theta)^2 \le a^2 + b^2$$

29. Prove: If $w_1, w_2, \ldots, w_n$ are positive real numbers and if $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ are any two vectors in $R^n$, then

$$|w_1u_1v_1 + w_2u_2v_2 + \cdots + w_nu_nv_n| \le (w_1u_1^2 + w_2u_2^2 + \cdots + w_nu_n^2)^{1/2}(w_1v_1^2 + w_2v_2^2 + \cdots + w_nv_n^2)^{1/2}$$



30. Show that equality holds in the Cauchy-Schwarz inequality if and only if u and v are linearly dependent.

31. Use vector methods to prove that a triangle that is inscribed in a circle so that it has a diameter for a side must be a right triangle.

Figure Ex-31

Hint Express the vectors $\overrightarrow{AB}$ and $\overrightarrow{BC}$ in the accompanying figure in terms of u and v.

32. With respect to the Euclidean inner product, the vectors $\mathbf{u} = (1, \sqrt{3})$ and $\mathbf{v} = (-1, \sqrt{3})$ have norm 2, and the angle between them is 60° (see the accompanying figure). Find a weighted Euclidean inner product with respect to which u and v are orthogonal unit vectors.

Figure Ex-32



33. (For Readers Who Have Studied Calculus)

Let f(x) and g(x) be continuous functions on [0, 1]. Prove:

(a) $$\left[\int_0^1 f(x)g(x)\,dx\right]^2 \le \left[\int_0^1 f^2(x)\,dx\right]\left[\int_0^1 g^2(x)\,dx\right]$$

(b) $$\left[\int_0^1 [f(x) + g(x)]^2\,dx\right]^{1/2} \le \left[\int_0^1 f^2(x)\,dx\right]^{1/2} + \left[\int_0^1 g^2(x)\,dx\right]^{1/2}$$

Hint Use the Cauchy-Schwarz inequality.

34. (For Readers Who Have Studied Calculus)

Let $C[0, \pi]$ have the inner product

$$\langle \mathbf{f}, \mathbf{g} \rangle = \int_0^\pi f(x)g(x)\,dx$$

and let $\mathbf{f}_n = \cos nx$ (n = 0, 1, 2, ...). Show that if $k \ne l$, then $\mathbf{f}_k$ and $\mathbf{f}_l$ are orthogonal with respect to the given inner product.



Discussion 

Discovery 



35.

(a) Let W be the line y = x in an xy-coordinate system in $R^2$. Describe the subspace $W^\perp$.

(b) Let W be the y-axis in an xyz-coordinate system in $R^3$. Describe the subspace $W^\perp$.

(c) Let W be the yz-plane of an xyz-coordinate system in $R^3$. Describe the subspace $W^\perp$.

36. Let Ax = 0 be a homogeneous system of three equations in the unknowns x, y, and z.

(a) If the solution space is a line through the origin in $R^3$, what kind of geometric object is the row space of A? Explain your reasoning.

(b) If the column space of A is a line through the origin, what kind of geometric object is the solution space of the homogeneous system $A^T\mathbf{x} = \mathbf{0}$? Explain your reasoning.

(c) If the homogeneous system $A^T\mathbf{x} = \mathbf{0}$ has a unique solution, what can you say about the row space and column space of A? Explain your reasoning.

37. Indicate whether each statement is always true or sometimes false. Justify your answer by giving a logical argument or a counterexample.

(a) If V is a subspace of $R^n$ and W is a subspace of V, then $W^\perp$ is a subspace of $V^\perp$.

(b) $\|\mathbf{u} + \mathbf{v} + \mathbf{w}\| \le \|\mathbf{u}\| + \|\mathbf{v}\| + \|\mathbf{w}\|$ for all vectors u, v, and w in an inner product space.

(c) If u is in the row space and the nullspace of a square matrix A, then u = 0.

(d) If u is in the row space and the column space of an n × n matrix A, then u = 0.

38. Let $M_{22}$ have the inner product $\langle U, V \rangle = \operatorname{tr}(U^T V) = \operatorname{tr}(V^T U)$ that was defined in Example 7 of Section 6.1. Describe the orthogonal complement of



(a) the subspace of all diagonal matrices 

(b) the subspace of symmetric matrices

6.3 ORTHONORMAL BASES; GRAM-SCHMIDT PROCESS; QR-DECOMPOSITION

In many problems involving vector spaces, the problem solver is free to choose any basis for the vector space that seems appropriate. In inner product spaces, the solution of a problem is often greatly simplified by choosing a basis in which the vectors are orthogonal to one another. In this section we shall show how such bases can be obtained.



DEFINITION 



A set of vectors in an inner product space is called an orthogonal set if all pairs of distinct vectors in the set are 
orthogonal. An orthogonal set in which each vector has norm 1 is called orthonormal. 



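EXAMPLE 1 An Orthogonal Set in $R^3$

Let

$$\mathbf{u}_1 = (0, 1, 0), \quad \mathbf{u}_2 = (1, 0, 1), \quad \mathbf{u}_3 = (1, 0, -1)$$

and assume that $R^3$ has the Euclidean inner product. It follows that the set of vectors $S = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ is orthogonal since

$$\langle \mathbf{u}_1, \mathbf{u}_2 \rangle = \langle \mathbf{u}_1, \mathbf{u}_3 \rangle = \langle \mathbf{u}_2, \mathbf{u}_3 \rangle = 0$$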



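If v is a nonzero vector in an inner product space, then by part (c) of Theorem 6.2.2, the vector

$$\frac{1}{\|\mathbf{v}\|}\mathbf{v}$$

has norm 1, since

$$\left\| \frac{1}{\|\mathbf{v}\|}\mathbf{v} \right\| = \left| \frac{1}{\|\mathbf{v}\|} \right| \|\mathbf{v}\| = \frac{1}{\|\mathbf{v}\|}\|\mathbf{v}\| = 1$$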



The process of multiplying a nonzero vector v by the reciprocal of its length to obtain a unit vector is called normalizing v. 
An orthogonal set of nonzero vectors can always be converted to an orthonormal set by normalizing each of its vectors. 



EXAMPLE 2 Constructing an Orthonormal Set 



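The Euclidean norms of the vectors in Example 1 are

$$\|\mathbf{u}_1\| = 1, \quad \|\mathbf{u}_2\| = \sqrt{2}, \quad \|\mathbf{u}_3\| = \sqrt{2}$$

Consequently, normalizing $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$ yields

$$\mathbf{v}_1 = \frac{\mathbf{u}_1}{\|\mathbf{u}_1\|} = (0, 1, 0), \quad \mathbf{v}_2 = \frac{\mathbf{u}_2}{\|\mathbf{u}_2\|} = \left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right), \quad \mathbf{v}_3 = \frac{\mathbf{u}_3}{\|\mathbf{u}_3\|} = \left(\frac{1}{\sqrt{2}}, 0, -\frac{1}{\sqrt{2}}\right)$$

We leave it for you to verify that the set $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is orthonormal by showing that

$$\langle \mathbf{v}_1, \mathbf{v}_2 \rangle = \langle \mathbf{v}_1, \mathbf{v}_3 \rangle = \langle \mathbf{v}_2, \mathbf{v}_3 \rangle = 0 \quad \text{and} \quad \|\mathbf{v}_1\| = \|\mathbf{v}_2\| = \|\mathbf{v}_3\| = 1$$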

In an inner product space, a basis consisting of orthonormal vectors is called an orthonormal basis, and a basis consisting of orthogonal vectors is called an orthogonal basis. A familiar example of an orthonormal basis is the standard basis for $R^3$ with the Euclidean inner product:

$$\mathbf{i} = (1, 0, 0), \quad \mathbf{j} = (0, 1, 0), \quad \mathbf{k} = (0, 0, 1)$$

This is the basis that is associated with rectangular coordinate systems (see Figure 5.4.4). More generally, in $R^n$ with the Euclidean inner product, the standard basis

$$\mathbf{e}_1 = (1, 0, 0, \ldots, 0), \quad \mathbf{e}_2 = (0, 1, 0, \ldots, 0), \quad \ldots, \quad \mathbf{e}_n = (0, 0, 0, \ldots, 1)$$

is orthonormal.

Coordinates Relative to Orthonormal Bases 

The interest in finding orthonormal bases for inner product spaces is motivated in part by the following theorem, which 
shows that it is exceptionally simple to express a vector in terms of an orthonormal basis. 

THEOREM 6.3.1 



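If $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is an orthonormal basis for an inner product space V, and u is any vector in V, then

$$\mathbf{u} = \langle \mathbf{u}, \mathbf{v}_1 \rangle \mathbf{v}_1 + \langle \mathbf{u}, \mathbf{v}_2 \rangle \mathbf{v}_2 + \cdots + \langle \mathbf{u}, \mathbf{v}_n \rangle \mathbf{v}_n$$

Proof Since $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis, a vector u can be expressed in the form

$$\mathbf{u} = k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_n\mathbf{v}_n$$

We shall complete the proof by showing that $k_i = \langle \mathbf{u}, \mathbf{v}_i \rangle$ for i = 1, 2, ..., n. For each vector $\mathbf{v}_i$ in S, we have

$$\langle \mathbf{u}, \mathbf{v}_i \rangle = \langle k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_n\mathbf{v}_n, \mathbf{v}_i \rangle$$
$$= k_1\langle \mathbf{v}_1, \mathbf{v}_i \rangle + k_2\langle \mathbf{v}_2, \mathbf{v}_i \rangle + \cdots + k_n\langle \mathbf{v}_n, \mathbf{v}_i \rangle$$

Since $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is an orthonormal set, we have

$$\langle \mathbf{v}_i, \mathbf{v}_i \rangle = \|\mathbf{v}_i\|^2 = 1 \quad \text{and} \quad \langle \mathbf{v}_j, \mathbf{v}_i \rangle = 0 \text{ if } j \ne i$$

Therefore, the above expression for $\langle \mathbf{u}, \mathbf{v}_i \rangle$ simplifies to

$$\langle \mathbf{u}, \mathbf{v}_i \rangle = k_i$$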

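Using the terminology and notation introduced in Section 5.4, the scalars

$$\langle \mathbf{u}, \mathbf{v}_1 \rangle, \langle \mathbf{u}, \mathbf{v}_2 \rangle, \ldots, \langle \mathbf{u}, \mathbf{v}_n \rangle$$

in Theorem 6.3.1 are the coordinates of the vector u relative to the orthonormal basis $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$, and

$$(\mathbf{u})_S = (\langle \mathbf{u}, \mathbf{v}_1 \rangle, \langle \mathbf{u}, \mathbf{v}_2 \rangle, \ldots, \langle \mathbf{u}, \mathbf{v}_n \rangle)$$

is the coordinate vector of u relative to this basis.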



EXAMPLE 3 Coordinate Vector Relative to an Orthonormal Basis 



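Let

$$\mathbf{v}_1 = (0, 1, 0), \quad \mathbf{v}_2 = \left(-\tfrac{4}{5}, 0, \tfrac{3}{5}\right), \quad \mathbf{v}_3 = \left(\tfrac{3}{5}, 0, \tfrac{4}{5}\right)$$

It is easy to check that $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is an orthonormal basis for $R^3$ with the Euclidean inner product. Express the vector u = (1, 1, 1) as a linear combination of the vectors in S, and find the coordinate vector $(\mathbf{u})_S$.

Solution

$$\langle \mathbf{u}, \mathbf{v}_1 \rangle = 1, \quad \langle \mathbf{u}, \mathbf{v}_2 \rangle = -\tfrac{1}{5}, \quad \text{and} \quad \langle \mathbf{u}, \mathbf{v}_3 \rangle = \tfrac{7}{5}$$

Therefore, by Theorem 6.3.1 we have

$$\mathbf{u} = \mathbf{v}_1 - \tfrac{1}{5}\mathbf{v}_2 + \tfrac{7}{5}\mathbf{v}_3$$

that is,

$$(1, 1, 1) = (0, 1, 0) - \tfrac{1}{5}\left(-\tfrac{4}{5}, 0, \tfrac{3}{5}\right) + \tfrac{7}{5}\left(\tfrac{3}{5}, 0, \tfrac{4}{5}\right)$$

The coordinate vector of u relative to S is

$$(\mathbf{u})_S = (\langle \mathbf{u}, \mathbf{v}_1 \rangle, \langle \mathbf{u}, \mathbf{v}_2 \rangle, \langle \mathbf{u}, \mathbf{v}_3 \rangle) = \left(1, -\tfrac{1}{5}, \tfrac{7}{5}\right)$$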



Remark The usefulness of Theorem 6.3.1 should be evident from this example if we remember that for nonorthonormal 
bases, it is usually necessary to solve a system of equations in order to express a vector in terms of the basis. 
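To make the contrast in the Remark concrete, the coordinates in Example 3 can be obtained with nothing more than three inner products. The sketch below uses exact rational arithmetic from Python's standard `fractions` module and assumes the basis vectors of Example 3 are $(0, 1, 0)$, $(-4/5, 0, 3/5)$, and $(3/5, 0, 4/5)$; the names `dot`, `coords`, and `rebuilt` are illustrative only.

```python
from fractions import Fraction as F


def dot(a, b):
    """Euclidean inner product (illustrative helper)."""
    return sum(x * y for x, y in zip(a, b))


# Orthonormal basis S assumed from Example 3.
S = [
    (F(0), F(1), F(0)),
    (F(-4, 5), F(0), F(3, 5)),
    (F(3, 5), F(0), F(4, 5)),
]
u = (F(1), F(1), F(1))

# Theorem 6.3.1: the i-th coordinate of u relative to S is <u, v_i>.
coords = [dot(u, v) for v in S]

# Rebuild u from its coordinates as a consistency check.
rebuilt = tuple(sum(c * v[i] for c, v in zip(coords, S)) for i in range(3))

print(coords)
print(rebuilt == u)
```

No linear system is solved at any point; for a basis that is not orthonormal, the same task would generally require elimination.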

Orthonormal bases for inner product spaces are convenient because, as the following theorem shows, many familiar formulas 
hold for such bases. 

THEOREM 6.3.2 



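If S is an orthonormal basis for an n-dimensional inner product space, and if

$$(\mathbf{u})_S = (u_1, u_2, \ldots, u_n) \quad \text{and} \quad (\mathbf{v})_S = (v_1, v_2, \ldots, v_n)$$

then

(a) $\|\mathbf{u}\| = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}$

(b) $d(\mathbf{u}, \mathbf{v}) = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2 + \cdots + (u_n - v_n)^2}$

(c) $\langle \mathbf{u}, \mathbf{v} \rangle = u_1v_1 + u_2v_2 + \cdots + u_nv_n$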













The proof is left for the exercises. 

Remark Observe that the right side of the equality in part (a) is the norm of the coordinate vector $(\mathbf{u})_S$ with respect to the Euclidean inner product on $R^n$, and the right side of the equality in part (c) is the Euclidean inner product of $(\mathbf{u})_S$ and $(\mathbf{v})_S$. Thus, by working with orthonormal bases, we can reduce the computation of general norms and inner products to the computation of Euclidean norms and inner products of the coordinate vectors.



EXAMPLE 4 Calculating Norms Using Orthonormal Bases 



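If $R^3$ has the Euclidean inner product, then the norm of the vector u = (1, 1, 1) is

$$\|\mathbf{u}\| = (\mathbf{u} \cdot \mathbf{u})^{1/2} = \sqrt{1^2 + 1^2 + 1^2} = \sqrt{3}$$

However, if we let $R^3$ have the orthonormal basis S in the last example, then we know from that example that the coordinate vector of u relative to S is

$$(\mathbf{u})_S = \left(1, -\tfrac{1}{5}, \tfrac{7}{5}\right)$$

The norm of u can also be calculated from this vector using part (a) of Theorem 6.3.2. This yields

$$\|\mathbf{u}\| = \sqrt{1^2 + \left(-\tfrac{1}{5}\right)^2 + \left(\tfrac{7}{5}\right)^2} = \sqrt{\tfrac{75}{25}} = \sqrt{3}$$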

Coordinates Relative to Orthogonal Bases 

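If $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is an orthogonal basis for an inner product space V, then normalizing each of these vectors yields the orthonormal basis

$$S' = \left\{ \frac{\mathbf{v}_1}{\|\mathbf{v}_1\|}, \frac{\mathbf{v}_2}{\|\mathbf{v}_2\|}, \ldots, \frac{\mathbf{v}_n}{\|\mathbf{v}_n\|} \right\}$$

Thus, if u is any vector in V, it follows from Theorem 6.3.1 that

$$\mathbf{u} = \left\langle \mathbf{u}, \frac{\mathbf{v}_1}{\|\mathbf{v}_1\|} \right\rangle \frac{\mathbf{v}_1}{\|\mathbf{v}_1\|} + \left\langle \mathbf{u}, \frac{\mathbf{v}_2}{\|\mathbf{v}_2\|} \right\rangle \frac{\mathbf{v}_2}{\|\mathbf{v}_2\|} + \cdots + \left\langle \mathbf{u}, \frac{\mathbf{v}_n}{\|\mathbf{v}_n\|} \right\rangle \frac{\mathbf{v}_n}{\|\mathbf{v}_n\|}$$

which, by part (c) of Theorem 6.1.1, can be rewritten as

$$\mathbf{u} = \frac{\langle \mathbf{u}, \mathbf{v}_1 \rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 + \frac{\langle \mathbf{u}, \mathbf{v}_2 \rangle}{\|\mathbf{v}_2\|^2}\mathbf{v}_2 + \cdots + \frac{\langle \mathbf{u}, \mathbf{v}_n \rangle}{\|\mathbf{v}_n\|^2}\mathbf{v}_n \tag{1}$$

This formula expresses u as a linear combination of the vectors in the orthogonal basis S. Some problems requiring the use of this formula are given in the exercises.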
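As an illustration of the formula for coordinates relative to an orthogonal basis, the sketch below applies it to the orthogonal (but not orthonormal) set of Example 1, namely (0, 1, 0), (1, 0, 1), (1, 0, -1). The vector `u = (3, 5, 1)` is a hypothetical choice made here for illustration; it does not come from the text.

```python
from fractions import Fraction as F


def dot(a, b):
    """Euclidean inner product (illustrative helper)."""
    return sum(x * y for x, y in zip(a, b))


# Orthogonal basis taken from Example 1; the vectors are not unit vectors.
basis = [(0, 1, 0), (1, 0, 1), (1, 0, -1)]
u = (3, 5, 1)  # hypothetical vector chosen for this sketch

# Coefficient of v_i is <u, v_i> / ||v_i||^2, kept exact with Fraction.
coeffs = [F(dot(u, v), dot(v, v)) for v in basis]

# Recombine to confirm that the coefficients reproduce u.
rebuilt = tuple(sum(c * v[i] for c, v in zip(coeffs, basis)) for i in range(3))

print(coeffs)
print(rebuilt)
```

Dividing by $\|\mathbf{v}_i\|^2$ is what compensates for the basis vectors not having length 1; no square roots are needed.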

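It is self-evident that if $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ are three nonzero, mutually perpendicular vectors in $R^3$, then none of these vectors lies in the same plane as the other two; that is, the vectors are linearly independent. The following theorem generalizes this result.

THEOREM 6.3.3

If $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is an orthogonal set of nonzero vectors in an inner product space, then S is linearly independent.

Proof Assume that

$$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_n\mathbf{v}_n = \mathbf{0} \tag{2}$$

To demonstrate that $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is linearly independent, we must prove that $k_1 = k_2 = \cdots = k_n = 0$.

For each $\mathbf{v}_i$ in S, it follows from (2) that

$$\langle k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_n\mathbf{v}_n, \mathbf{v}_i \rangle = \langle \mathbf{0}, \mathbf{v}_i \rangle = 0$$

or, equivalently,

$$k_1\langle \mathbf{v}_1, \mathbf{v}_i \rangle + k_2\langle \mathbf{v}_2, \mathbf{v}_i \rangle + \cdots + k_n\langle \mathbf{v}_n, \mathbf{v}_i \rangle = 0$$

From the orthogonality of S it follows that $\langle \mathbf{v}_j, \mathbf{v}_i \rangle = 0$ when $j \ne i$, so this equation reduces to

$$k_i\langle \mathbf{v}_i, \mathbf{v}_i \rangle = 0$$

Since the vectors in S are assumed to be nonzero, $\langle \mathbf{v}_i, \mathbf{v}_i \rangle \ne 0$ by the positivity axiom for inner products. Therefore, $k_i = 0$. Since the subscript i is arbitrary, we have $k_1 = k_2 = \cdots = k_n = 0$; thus S is linearly independent.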



EXAMPLE 5 Using Theorem 6.3.3 



In Example 2 we showed that the vectors

$$\mathbf{v}_1 = (0, 1, 0), \quad \mathbf{v}_2 = \left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right), \quad \text{and} \quad \mathbf{v}_3 = \left(\frac{1}{\sqrt{2}}, 0, -\frac{1}{\sqrt{2}}\right)$$

form an orthonormal set with respect to the Euclidean inner product on $R^3$. By Theorem 6.3.3, these vectors form a linearly independent set, and since $R^3$ is three-dimensional, $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is an orthonormal basis for $R^3$ by Theorem 5.4.5.



Orthogonal Projections 

We shall now develop some results that will help us to construct orthogonal and orthonormal bases for inner product spaces. 

In $R^2$ or $R^3$ with the Euclidean inner product, it is evident geometrically that if W is a line or a plane through the origin, then each vector u in the space can be expressed as a sum

$$\mathbf{u} = \mathbf{w}_1 + \mathbf{w}_2$$

where $\mathbf{w}_1$ is in W and $\mathbf{w}_2$ is perpendicular to W (Figure 6.3.1). This result is a special case of the following general theorem whose proof is given at the end of this section.

Figure 6.3.1 [panels (a) and (b)]



THEOREM 6.3.4 



Projection Theorem

If W is a finite-dimensional subspace of an inner product space V, then every vector u in V can be expressed in exactly one way as

$$\mathbf{u} = \mathbf{w}_1 + \mathbf{w}_2 \tag{3}$$

where $\mathbf{w}_1$ is in W and $\mathbf{w}_2$ is in $W^\perp$.



















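The vector $\mathbf{w}_1$ in the preceding theorem is called the orthogonal projection of u on W and is denoted by $\operatorname{proj}_W \mathbf{u}$. The vector $\mathbf{w}_2$ is called the component of u orthogonal to W and is denoted by $\operatorname{proj}_{W^\perp} \mathbf{u}$. Thus Formula 3 in the Projection Theorem can be expressed as

$$\mathbf{u} = \operatorname{proj}_W \mathbf{u} + \operatorname{proj}_{W^\perp} \mathbf{u} \tag{4}$$

Since $\mathbf{w}_2 = \mathbf{u} - \mathbf{w}_1$ it follows that

$$\operatorname{proj}_{W^\perp} \mathbf{u} = \mathbf{u} - \operatorname{proj}_W \mathbf{u}$$

so Formula 4 can also be written as

$$\mathbf{u} = \operatorname{proj}_W \mathbf{u} + (\mathbf{u} - \operatorname{proj}_W \mathbf{u}) \tag{5}$$

(Figure 6.3.2).

Figure 6.3.2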



The following theorem, whose proof is requested in the exercises, provides formulas for calculating orthogonal projections. 
THEOREM 6.3.5 



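Let W be a finite-dimensional subspace of an inner product space V.

(a) If $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is an orthonormal basis for W, and u is any vector in V, then

$$\operatorname{proj}_W \mathbf{u} = \langle \mathbf{u}, \mathbf{v}_1 \rangle \mathbf{v}_1 + \langle \mathbf{u}, \mathbf{v}_2 \rangle \mathbf{v}_2 + \cdots + \langle \mathbf{u}, \mathbf{v}_r \rangle \mathbf{v}_r \tag{6}$$

(b) If $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is an orthogonal basis for W, and u is any vector in V, then

$$\operatorname{proj}_W \mathbf{u} = \frac{\langle \mathbf{u}, \mathbf{v}_1 \rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 + \frac{\langle \mathbf{u}, \mathbf{v}_2 \rangle}{\|\mathbf{v}_2\|^2}\mathbf{v}_2 + \cdots + \frac{\langle \mathbf{u}, \mathbf{v}_r \rangle}{\|\mathbf{v}_r\|^2}\mathbf{v}_r \tag{7}$$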



EXAMPLE 6 Calculating Projections 



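Let $R^3$ have the Euclidean inner product, and let W be the subspace spanned by the orthonormal vectors $\mathbf{v}_1 = (0, 1, 0)$ and $\mathbf{v}_2 = \left(-\tfrac{4}{5}, 0, \tfrac{3}{5}\right)$. From (6) the orthogonal projection of u = (1, 1, 1) on W is

$$\operatorname{proj}_W \mathbf{u} = \langle \mathbf{u}, \mathbf{v}_1 \rangle \mathbf{v}_1 + \langle \mathbf{u}, \mathbf{v}_2 \rangle \mathbf{v}_2$$
$$= (1)(0, 1, 0) + \left(-\tfrac{1}{5}\right)\left(-\tfrac{4}{5}, 0, \tfrac{3}{5}\right)$$
$$= \left(\tfrac{4}{25}, 1, -\tfrac{3}{25}\right)$$

The component of u orthogonal to W is

$$\operatorname{proj}_{W^\perp} \mathbf{u} = \mathbf{u} - \operatorname{proj}_W \mathbf{u} = (1, 1, 1) - \left(\tfrac{4}{25}, 1, -\tfrac{3}{25}\right) = \left(\tfrac{21}{25}, 0, \tfrac{28}{25}\right)$$

Observe that $\operatorname{proj}_{W^\perp} \mathbf{u}$ is orthogonal to both $\mathbf{v}_1$ and $\mathbf{v}_2$, so this vector is orthogonal to each vector in the space W spanned by $\mathbf{v}_1$ and $\mathbf{v}_2$, as it should be.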
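The computation in Example 6 follows a fixed pattern, so it can be sketched as a small routine. The code below assumes an orthonormal basis for W is supplied (here the two vectors of the example, read as (0, 1, 0) and (-4/5, 0, 3/5)); the function name `proj` and the other identifiers are illustrative, not notation from the text.

```python
from fractions import Fraction as F


def dot(a, b):
    """Euclidean inner product (illustrative helper)."""
    return sum(x * y for x, y in zip(a, b))


def proj(u, orthonormal_basis):
    """Orthogonal projection of u on the span of an orthonormal basis:
    the sum of <u, v_i> v_i over the basis vectors."""
    result = [F(0)] * len(u)
    for v in orthonormal_basis:
        c = dot(u, v)
        result = [r + c * vi for r, vi in zip(result, v)]
    return tuple(result)


v1 = (F(0), F(1), F(0))
v2 = (F(-4, 5), F(0), F(3, 5))
u = (F(1), F(1), F(1))

p = proj(u, [v1, v2])                       # projection of u on W
perp = tuple(a - b for a, b in zip(u, p))   # component of u orthogonal to W

print(p)
print(perp)
print(dot(perp, v1), dot(perp, v2))
```

The last line repeats the observation that closes the example: the orthogonal component has zero inner product with each basis vector of W.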

Finding Orthogonal and Orthonormal Bases 

We have seen that orthonormal bases exhibit a variety of useful properties. Our next theorem, which is the main result in this section, shows that every nonzero finite-dimensional inner product space has an orthonormal basis. The proof of this result is extremely important, since it provides an algorithm, or method, for converting an arbitrary basis into an orthonormal basis.

THEOREM 6.3.6 



Every nonzero finite-dimensional inner product space has an orthonormal basis. 



Proof Let V be any nonzero finite-dimensional inner product space, and suppose that $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ is any basis for V. It suffices to show that V has an orthogonal basis, since the vectors in the orthogonal basis can be normalized to produce an orthonormal basis for V. The following sequence of steps will produce an orthogonal basis $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ for V.




Jorgen Pederson Gram (1850-1916) was a Danish actuary. Gram's early education was at village schools supplemented 
by private tutoring. After graduating from high school, he obtained a master's degree in mathematics with specialization in 
the newly developing modern algebra. Gram then took a position as an actuary for the Hafnia Life Insurance Company, 
where he developed mathematical foundations of accident insurance for the company Skjold. He served on the Board of 
Directors of Hafnia and directed Skjold until 1910, at which time he became director of the Danish Insurance Board. 
During his employ as an actuary, he earned a Ph.D. based on his dissertation "On Series Development Utilizing the Least 
Squares Method." It was in this thesis that his contributions to the Gram- Schmidt process were first formulated. Gram 
eventually became interested in abstract number theory and won a gold medal from the Royal Danish Society of Sciences 
and Letters for his contributions to that field. However, he also had a lifelong interest in the interplay between theoretical 
and applied mathematics that led to four treatises on Danish forest management. Gram was killed one evening in a bicycle 
collision on the way to a meeting of the Royal Danish Society. 



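Step 1. Let $\mathbf{v}_1 = \mathbf{u}_1$.

Step 2. As illustrated in Figure 6.3.3, we can obtain a vector $\mathbf{v}_2$ that is orthogonal to $\mathbf{v}_1$ by computing the component of $\mathbf{u}_2$ that is orthogonal to the space $W_1$ spanned by $\mathbf{v}_1$. We use Formula 7:

$$\mathbf{v}_2 = \mathbf{u}_2 - \operatorname{proj}_{W_1} \mathbf{u}_2 = \mathbf{u}_2 - \frac{\langle \mathbf{u}_2, \mathbf{v}_1 \rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1$$

Figure 6.3.3

Of course, if $\mathbf{v}_2 = \mathbf{0}$, then $\mathbf{v}_2$ is not a basis vector. But this cannot happen, since it would then follow from the preceding formula for $\mathbf{v}_2$ that

$$\mathbf{u}_2 = \frac{\langle \mathbf{u}_2, \mathbf{v}_1 \rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 = \frac{\langle \mathbf{u}_2, \mathbf{v}_1 \rangle}{\|\mathbf{u}_1\|^2}\mathbf{u}_1$$

which says that $\mathbf{u}_2$ is a multiple of $\mathbf{u}_1$, contradicting the linear independence of the basis $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$.

Step 3. To construct a vector $\mathbf{v}_3$ that is orthogonal to both $\mathbf{v}_1$ and $\mathbf{v}_2$, we compute the component of $\mathbf{u}_3$ orthogonal to the space $W_2$ spanned by $\mathbf{v}_1$ and $\mathbf{v}_2$. From Formula 7,

$$\mathbf{v}_3 = \mathbf{u}_3 - \operatorname{proj}_{W_2} \mathbf{u}_3 = \mathbf{u}_3 - \frac{\langle \mathbf{u}_3, \mathbf{v}_1 \rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 - \frac{\langle \mathbf{u}_3, \mathbf{v}_2 \rangle}{\|\mathbf{v}_2\|^2}\mathbf{v}_2$$

As in Step 2, the linear independence of $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ ensures that $\mathbf{v}_3 \ne \mathbf{0}$.

Continuing in this way, we obtain after n steps an orthogonal set of nonzero vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$. Since V is n-dimensional and every orthogonal set of nonzero vectors is linearly independent (Theorem 6.3.3), this set is an orthogonal basis for V.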

The preceding step-by-step construction for converting an arbitrary basis into an orthogonal basis is called the 
Gram-Schmidt process . 



EXAMPLE 7 Using the Gram-Schmidt Process 



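Consider the vector space $R^3$ with the Euclidean inner product. Apply the Gram-Schmidt process to transform the basis vectors $\mathbf{u}_1 = (1, 1, 1)$, $\mathbf{u}_2 = (0, 1, 1)$, $\mathbf{u}_3 = (0, 0, 1)$ into an orthogonal basis $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$; then normalize the orthogonal basis vectors to obtain an orthonormal basis $\{\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3\}$.

Solution

Step 1. $\mathbf{v}_1 = \mathbf{u}_1 = (1, 1, 1)$

Step 2.

$$\mathbf{v}_2 = \mathbf{u}_2 - \operatorname{proj}_{W_1} \mathbf{u}_2 = \mathbf{u}_2 - \frac{\langle \mathbf{u}_2, \mathbf{v}_1 \rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1$$
$$= (0, 1, 1) - \tfrac{2}{3}(1, 1, 1) = \left(-\tfrac{2}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right)$$

Step 3.

$$\mathbf{v}_3 = \mathbf{u}_3 - \operatorname{proj}_{W_2} \mathbf{u}_3 = \mathbf{u}_3 - \frac{\langle \mathbf{u}_3, \mathbf{v}_1 \rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 - \frac{\langle \mathbf{u}_3, \mathbf{v}_2 \rangle}{\|\mathbf{v}_2\|^2}\mathbf{v}_2$$
$$= (0, 0, 1) - \tfrac{1}{3}(1, 1, 1) - \frac{1/3}{2/3}\left(-\tfrac{2}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right) = \left(0, -\tfrac{1}{2}, \tfrac{1}{2}\right)$$

Thus

$$\mathbf{v}_1 = (1, 1, 1), \quad \mathbf{v}_2 = \left(-\tfrac{2}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right), \quad \mathbf{v}_3 = \left(0, -\tfrac{1}{2}, \tfrac{1}{2}\right)$$

form an orthogonal basis for $R^3$. The norms of these vectors are

$$\|\mathbf{v}_1\| = \sqrt{3}, \quad \|\mathbf{v}_2\| = \frac{\sqrt{6}}{3}, \quad \|\mathbf{v}_3\| = \frac{1}{\sqrt{2}}$$

so an orthonormal basis for $R^3$ is

$$\mathbf{q}_1 = \frac{\mathbf{v}_1}{\|\mathbf{v}_1\|} = \left(\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\right), \quad \mathbf{q}_2 = \frac{\mathbf{v}_2}{\|\mathbf{v}_2\|} = \left(-\frac{2}{\sqrt{6}}, \frac{1}{\sqrt{6}}, \frac{1}{\sqrt{6}}\right), \quad \mathbf{q}_3 = \frac{\mathbf{v}_3}{\|\mathbf{v}_3\|} = \left(0, -\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$$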




Erhardt Schmidt (1876-1959) was a German mathematician. Schmidt received his doctoral degree from Gottingen 
University in 1905, where he studied under one of the giants of mathematics, David Hilbert. He eventually went to teach 
at Berlin University in 1917, where he stayed for the rest of his life. Schmidt made important contributions to a variety of 
mathematical fields but is most noteworthy for fashioning many of Hilbert's diverse ideas into a general concept (called a 
Hilbert space), which is fundamental in the study of infinite-dimensional vector spaces. Schmidt first described the 
process that bears his name in a paper on integral equations published in 1907. 



Remark In the preceding example we used the Gram-Schmidt process to produce an orthogonal basis; then, after the entire 
orthogonal basis was obtained, we normalized to obtain an orthonormal basis. Alternatively, one can normalize each 
orthogonal basis vector as soon as it is obtained, thereby generating the orthonormal basis step by step. However, this 
method has the slight disadvantage of producing more square roots to manipulate. 

The Gram-Schmidt process with subsequent normalization not only converts an arbitrary basis $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ into an orthonormal basis $\{\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_n\}$ but does it in such a way that for $k \ge 2$ the following relationships hold:

$\{\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_k\}$ is an orthonormal basis for the space spanned by $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k\}$.

$\mathbf{q}_k$ is orthogonal to the space spanned by $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_{k-1}\}$.

We omit the proofs, but these facts should become evident after some thoughtful examination of the proof of Theorem 6.3.6. 
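The Gram-Schmidt process with subsequent normalization can be sketched in a few lines of code. The version below is a minimal illustration for vectors in $R^n$ with the Euclidean inner product, assuming the input list is linearly independent (no check is made); the function names `gram_schmidt` and `normalize` are chosen here and are not from the text. The orthogonalization step uses exact fractions, and square roots appear only in the final normalization, as in Example 7.

```python
from fractions import Fraction as F
import math


def dot(a, b):
    """Euclidean inner product (illustrative helper)."""
    return sum(x * y for x, y in zip(a, b))


def gram_schmidt(basis):
    """Return an orthogonal list v_1, ..., v_n built from a linearly
    independent list u_1, ..., u_n by subtracting projections."""
    ortho = []
    for u in basis:
        v = list(u)
        for w in ortho:
            c = dot(u, w) / dot(w, w)          # <u, w> / ||w||^2
            v = [vi - c * wi for vi, wi in zip(v, w)]
        ortho.append(tuple(v))
    return ortho


def normalize(v):
    """Scale v to unit length (floating point)."""
    n = math.sqrt(dot(v, v))
    return tuple(float(x) / n for x in v)


# Basis of Example 7.
U = [(F(1), F(1), F(1)), (F(0), F(1), F(1)), (F(0), F(0), F(1))]
V = gram_schmidt(U)
Q = [normalize(v) for v in V]

print(V)
print(Q)
```

In floating-point practice, subtracting each projection from the running remainder (the so-called modified Gram-Schmidt variant) is usually preferred for numerical stability; the version above mirrors the formulas of the text instead.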

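QR-Decomposition

We pose the following problem.

Problem If A is an m × n matrix with linearly independent column vectors, and if Q is the matrix with orthonormal column vectors that results from applying the Gram-Schmidt process to the column vectors of A, what relationship, if any, exists between A and Q?

To solve this problem, suppose that the column vectors of A are $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$ and the orthonormal column vectors of Q are $\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_n$; thus

$$A = [\mathbf{u}_1 \mid \mathbf{u}_2 \mid \cdots \mid \mathbf{u}_n] \quad \text{and} \quad Q = [\mathbf{q}_1 \mid \mathbf{q}_2 \mid \cdots \mid \mathbf{q}_n]$$

It follows from Theorem 6.3.1 that $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$ are expressible in terms of the vectors $\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_n$ as

$$\mathbf{u}_1 = \langle \mathbf{u}_1, \mathbf{q}_1 \rangle \mathbf{q}_1 + \langle \mathbf{u}_1, \mathbf{q}_2 \rangle \mathbf{q}_2 + \cdots + \langle \mathbf{u}_1, \mathbf{q}_n \rangle \mathbf{q}_n$$
$$\mathbf{u}_2 = \langle \mathbf{u}_2, \mathbf{q}_1 \rangle \mathbf{q}_1 + \langle \mathbf{u}_2, \mathbf{q}_2 \rangle \mathbf{q}_2 + \cdots + \langle \mathbf{u}_2, \mathbf{q}_n \rangle \mathbf{q}_n$$
$$\vdots$$
$$\mathbf{u}_n = \langle \mathbf{u}_n, \mathbf{q}_1 \rangle \mathbf{q}_1 + \langle \mathbf{u}_n, \mathbf{q}_2 \rangle \mathbf{q}_2 + \cdots + \langle \mathbf{u}_n, \mathbf{q}_n \rangle \mathbf{q}_n$$

Recalling from Section 1.3 that the jth column vector of a matrix product is a linear combination of the column vectors of the first factor with coefficients coming from the jth column of the second factor, it follows that these relationships can be expressed in matrix form as

$$[\mathbf{u}_1 \mid \mathbf{u}_2 \mid \cdots \mid \mathbf{u}_n] = [\mathbf{q}_1 \mid \mathbf{q}_2 \mid \cdots \mid \mathbf{q}_n] \begin{bmatrix} \langle \mathbf{u}_1, \mathbf{q}_1 \rangle & \langle \mathbf{u}_2, \mathbf{q}_1 \rangle & \cdots & \langle \mathbf{u}_n, \mathbf{q}_1 \rangle \\ \langle \mathbf{u}_1, \mathbf{q}_2 \rangle & \langle \mathbf{u}_2, \mathbf{q}_2 \rangle & \cdots & \langle \mathbf{u}_n, \mathbf{q}_2 \rangle \\ \vdots & \vdots & & \vdots \\ \langle \mathbf{u}_1, \mathbf{q}_n \rangle & \langle \mathbf{u}_2, \mathbf{q}_n \rangle & \cdots & \langle \mathbf{u}_n, \mathbf{q}_n \rangle \end{bmatrix}$$

or more briefly as

$$A = QR \tag{8}$$

However, it is a property of the Gram-Schmidt process that for $j \ge 2$, the vector $\mathbf{q}_j$ is orthogonal to $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_{j-1}$; thus, all entries below the main diagonal of R are zero,

$$R = \begin{bmatrix} \langle \mathbf{u}_1, \mathbf{q}_1 \rangle & \langle \mathbf{u}_2, \mathbf{q}_1 \rangle & \cdots & \langle \mathbf{u}_n, \mathbf{q}_1 \rangle \\ 0 & \langle \mathbf{u}_2, \mathbf{q}_2 \rangle & \cdots & \langle \mathbf{u}_n, \mathbf{q}_2 \rangle \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \langle \mathbf{u}_n, \mathbf{q}_n \rangle \end{bmatrix} \tag{9}$$

We leave it as an exercise to show that the diagonal entries of R are nonzero, so R is invertible. Thus Equation 8 is a factorization of A into the product of a matrix Q with orthonormal column vectors and an invertible upper triangular matrix R. We call Equation 8 the QR-decomposition of A. In summary, we have the following theorem.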



THEOREM 6.3.7 



QR-Decomposition

If A is an m × n matrix with linearly independent column vectors, then A can be factored as

$$A = QR$$

where Q is an m × n matrix with orthonormal column vectors, and R is an n × n invertible upper triangular matrix.

Remark Recall from Theorem 6.2.7 that if A is an n × n matrix, then the invertibility of A is equivalent to linear independence of the column vectors; thus, every invertible matrix has a QR-decomposition.



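EXAMPLE 8 QR-Decomposition of a 3 × 3 Matrix

Find the QR-decomposition of

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}$$

Solution

The column vectors of A are

$$\mathbf{u}_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad \mathbf{u}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$

Applying the Gram-Schmidt process with subsequent normalization to these column vectors yields the orthonormal vectors (see Example 7)

$$\mathbf{q}_1 = \begin{bmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{bmatrix}, \quad \mathbf{q}_2 = \begin{bmatrix} -2/\sqrt{6} \\ 1/\sqrt{6} \\ 1/\sqrt{6} \end{bmatrix}, \quad \mathbf{q}_3 = \begin{bmatrix} 0 \\ -1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}$$

and from (9) the matrix R is

$$R = \begin{bmatrix} \langle \mathbf{u}_1, \mathbf{q}_1 \rangle & \langle \mathbf{u}_2, \mathbf{q}_1 \rangle & \langle \mathbf{u}_3, \mathbf{q}_1 \rangle \\ 0 & \langle \mathbf{u}_2, \mathbf{q}_2 \rangle & \langle \mathbf{u}_3, \mathbf{q}_2 \rangle \\ 0 & 0 & \langle \mathbf{u}_3, \mathbf{q}_3 \rangle \end{bmatrix} = \begin{bmatrix} 3/\sqrt{3} & 2/\sqrt{3} & 1/\sqrt{3} \\ 0 & 2/\sqrt{6} & 1/\sqrt{6} \\ 0 & 0 & 1/\sqrt{2} \end{bmatrix}$$

Thus the QR-decomposition of A is

$$\underbrace{\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}}_{A} = \underbrace{\begin{bmatrix} 1/\sqrt{3} & -2/\sqrt{6} & 0 \\ 1/\sqrt{3} & 1/\sqrt{6} & -1/\sqrt{2} \\ 1/\sqrt{3} & 1/\sqrt{6} & 1/\sqrt{2} \end{bmatrix}}_{Q} \underbrace{\begin{bmatrix} 3/\sqrt{3} & 2/\sqrt{3} & 1/\sqrt{3} \\ 0 & 2/\sqrt{6} & 1/\sqrt{6} \\ 0 & 0 & 1/\sqrt{2} \end{bmatrix}}_{R}$$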
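The construction of Q and R in Example 8 can be sketched as follows. This is an illustrative floating-point version, not a production routine: it assumes the columns passed in are linearly independent, it normalizes each vector as soon as it is obtained (the alternative mentioned in the Remark after Example 7), and the name `qr_gram_schmidt` is introduced here for illustration.

```python
import math


def dot(a, b):
    """Euclidean inner product (illustrative helper)."""
    return sum(x * y for x, y in zip(a, b))


def qr_gram_schmidt(cols):
    """Given the columns u_1..u_n of A, return (q_cols, R), where q_cols
    are orthonormal columns and R[i][j] = <u_j, q_i> for j >= i."""
    q_cols = []
    for u in cols:
        v = list(u)
        for q in q_cols:
            c = dot(u, q)                      # q is already a unit vector
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        q_cols.append([x / norm for x in v])
    n = len(cols)
    R = [[dot(cols[j], q_cols[i]) if j >= i else 0.0 for j in range(n)]
         for i in range(n)]
    return q_cols, R


# Columns of the matrix A in Example 8.
cols = [[1.0, 1.0, 1.0], [0.0, 1.0, 1.0], [0.0, 0.0, 1.0]]
q_cols, R = qr_gram_schmidt(cols)

# Column j of QR is the sum over i of R[i][j] * q_i; it should equal u_j.
rebuilt = [[sum(R[i][j] * q_cols[i][k] for i in range(3)) for k in range(3)]
           for j in range(3)]

for row in R:
    print([round(x, 4) for x in row])
```

Comparing `rebuilt` with `cols` confirms A = QR up to rounding error.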



The Role of the QR-Decomposition in Linear Algebra

In recent years the QR-decomposition has assumed growing importance as the mathematical foundation for a wide variety of practical numerical algorithms, including a widely used algorithm for computing eigenvalues of large matrices. Such algorithms are discussed in textbooks that deal with numerical linear algebra.

Additional Proof 



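Proof of Theorem 6.3.4 There are two parts to the proof. First we must find vectors $\mathbf{w}_1$ and $\mathbf{w}_2$ with the stated properties, 
and then we must show that these are the only such vectors. 

By the Gram-Schmidt process, there is an orthonormal basis $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ for $W$. 

Let 

\[ \mathbf{w}_1 = \langle \mathbf{u}, \mathbf{v}_1 \rangle \mathbf{v}_1 + \langle \mathbf{u}, \mathbf{v}_2 \rangle \mathbf{v}_2 + \cdots + \langle \mathbf{u}, \mathbf{v}_n \rangle \mathbf{v}_n \tag{10} \]

and 

\[ \mathbf{w}_2 = \mathbf{u} - \mathbf{w}_1 \tag{11} \]

It follows that $\mathbf{w}_1 + \mathbf{w}_2 = \mathbf{w}_1 + (\mathbf{u} - \mathbf{w}_1) = \mathbf{u}$, so it remains to show that $\mathbf{w}_1$ is in $W$ and $\mathbf{w}_2$ is orthogonal to $W$. But $\mathbf{w}_1$ lies in 
$W$ because it is a linear combination of the basis vectors for $W$. To show that $\mathbf{w}_2$ is orthogonal to $W$, we must show that 
$\langle \mathbf{w}_2, \mathbf{w} \rangle = 0$ for every vector $\mathbf{w}$ in $W$. But if $\mathbf{w}$ is any vector in $W$, it can be expressed as a linear combination 

\[ \mathbf{w} = k_1 \mathbf{v}_1 + k_2 \mathbf{v}_2 + \cdots + k_n \mathbf{v}_n \]

of the basis vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$. Thus 

\[ \langle \mathbf{w}_2, \mathbf{w} \rangle = \langle \mathbf{u} - \mathbf{w}_1, \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{w} \rangle - \langle \mathbf{w}_1, \mathbf{w} \rangle \tag{12} \]

But 

\[ \langle \mathbf{u}, \mathbf{w} \rangle = \langle \mathbf{u}, k_1 \mathbf{v}_1 + k_2 \mathbf{v}_2 + \cdots + k_n \mathbf{v}_n \rangle = k_1 \langle \mathbf{u}, \mathbf{v}_1 \rangle + k_2 \langle \mathbf{u}, \mathbf{v}_2 \rangle + \cdots + k_n \langle \mathbf{u}, \mathbf{v}_n \rangle \]

and by part (c) of Theorem 6.3.2, 

\[ \langle \mathbf{w}_1, \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{v}_1 \rangle k_1 + \langle \mathbf{u}, \mathbf{v}_2 \rangle k_2 + \cdots + \langle \mathbf{u}, \mathbf{v}_n \rangle k_n \]

Thus $\langle \mathbf{u}, \mathbf{w} \rangle$ and $\langle \mathbf{w}_1, \mathbf{w} \rangle$ are equal, so (12) yields $\langle \mathbf{w}_2, \mathbf{w} \rangle = 0$, which is what we want to show. 

To see that (10) and (11) are the only vectors with the properties stated in the theorem, suppose that we can also write 

\[ \mathbf{u} = \mathbf{w}'_1 + \mathbf{w}'_2 \tag{13} \]

where $\mathbf{w}'_1$ is in $W$ and $\mathbf{w}'_2$ is orthogonal to $W$. If we subtract from (13) the equation 

\[ \mathbf{u} = \mathbf{w}_1 + \mathbf{w}_2 \]

we obtain 

\[ \mathbf{0} = (\mathbf{w}'_1 - \mathbf{w}_1) + (\mathbf{w}'_2 - \mathbf{w}_2) \]

or 

\[ \mathbf{w}_1 - \mathbf{w}'_1 = \mathbf{w}'_2 - \mathbf{w}_2 \tag{14} \]

Since $\mathbf{w}_2$ and $\mathbf{w}'_2$ are orthogonal to $W$, their difference is also orthogonal to $W$, since for any vector $\mathbf{w}$ in $W$, we can write 

\[ \langle \mathbf{w}, \mathbf{w}'_2 - \mathbf{w}_2 \rangle = \langle \mathbf{w}, \mathbf{w}'_2 \rangle - \langle \mathbf{w}, \mathbf{w}_2 \rangle = 0 - 0 = 0 \]

But $\mathbf{w}'_2 - \mathbf{w}_2$ is itself a vector in $W$, since from (14) it is the difference of the two vectors $\mathbf{w}_1$ and $\mathbf{w}'_1$ that lie in the subspace $W$. 
Thus, $\mathbf{w}'_2 - \mathbf{w}_2$ must be orthogonal to itself; that is, 

\[ \langle \mathbf{w}'_2 - \mathbf{w}_2, \mathbf{w}'_2 - \mathbf{w}_2 \rangle = 0 \]

But this implies that $\mathbf{w}'_2 - \mathbf{w}_2 = \mathbf{0}$ by Axiom 4 for inner products. Thus $\mathbf{w}'_2 = \mathbf{w}_2$, and by (14), $\mathbf{w}'_1 = \mathbf{w}_1$. 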
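The construction in Formulas 10 and 11 is easy to carry out by machine. The sketch below is illustrative only: the helper name `decompose`, the orthonormal basis, and the vector $\mathbf{u}$ are assumptions chosen for the demonstration, and exact rational arithmetic is used so that the orthogonality check is not blurred by rounding.

```python
from fractions import Fraction

def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def decompose(u, orthonormal_basis):
    """Split u into w1 (in W) and w2 (orthogonal to W), following Formulas 10 and 11.

    orthonormal_basis is assumed to be an orthonormal basis for W.
    """
    w1 = [Fraction(0)] * len(u)
    for v in orthonormal_basis:
        c = dot(u, v)                            # coefficient <u, v_i>
        w1 = [a + c * b for a, b in zip(w1, v)]
    w2 = [a - b for a, b in zip(u, w1)]          # w2 = u - w1
    return w1, w2

# Hypothetical data: W is the plane in R^3 spanned by these orthonormal vectors.
v1 = [Fraction(0), Fraction(1), Fraction(0)]
v2 = [Fraction(-4, 5), Fraction(0), Fraction(3, 5)]
u = [Fraction(1), Fraction(1), Fraction(1)]

w1, w2 = decompose(u, [v1, v2])
print("w1 =", w1)
print("w2 =", w2)
print("w2 orthogonal to W:", dot(w2, v1) == 0 and dot(w2, v2) == 0)
```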



Exercise Set 6.3 






1. Which of the following sets of vectors are orthogonal with respect to the Euclidean inner product on $R^2$? 

(a) $(0, 1)$, $(2, 0)$ 

(b) $(-1/\sqrt{2}, 1/\sqrt{2})$, $(1/\sqrt{2}, 1/\sqrt{2})$ 

(c) $(-1/\sqrt{2}, -1/\sqrt{2})$, $(1/\sqrt{2}, 1/\sqrt{2})$ 

(d) $(0, 0)$, $(0, 1)$ 

2. Which of the sets in Exercise 1 are orthonormal with respect to the Euclidean inner product on $R^2$? 



3. Which of the following sets of vectors are orthogonal with respect to the Euclidean inner product on $R^3$? 



(*) M o.-L. -L-L l . 1 1 



ft- •&)-{&•&• &n f2- -f2 t 



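(b) $\left(\tfrac{2}{3}, -\tfrac{2}{3}, \tfrac{1}{3}\right)$, $\left(\tfrac{2}{3}, \tfrac{1}{3}, -\tfrac{2}{3}\right)$, $\left(\tfrac{1}{3}, \tfrac{2}{3}, \tfrac{2}{3}\right)$ 

(c) $(1, 0, 0)$, $\left(0, \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right)$, $(0, 0, 1)$ 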



W f_L J_ _JL f_L _A_ o 1 



4. Which of the sets in Exercise 3 are orthonormal with respect to the Euclidean inner product on $R^3$? 

5. Which of the following sets of polynomials are orthonormal with respect to the inner product on $P_2$ discussed in Example 
8 of Section 6.1? 



(a) f-f*+f 2 .f+}*-!*4+f*+f* 2 



(b) j JL X 1 JL X 2 2 
' /2 /2 '* 



6. Which of the following sets of matrices are orthonormal with respect to the inner product on $M_{22}$ discussed in Example 7 
of Section 6.1? 



(a) 



"1 0" 







2 
3 




f° "1 




M 


_0 0_ 


? 


1 

3 


2 
3 


? 


2 1 

3 3 


7 


2 2 

3 3 



(b) 



"1 0" 
_0 0_ 


? 


"o r 

0_ 


7 


"0 0" 

1 1_ 


? 


"0 

1 -1_ 



7. Verify that the given set of vectors is orthogonal with respect to the Euclidean inner product; then convert it to an 
orthonormal set by normalizing the vectors. 



(a) (-1,2), (6, 3) 



(b) (1,0, -1), (2, 0,2), (0,5,0) 



(c) (11 1) (1 1 oWll -2A 
^5' 5' 5f [ 2' 2' f \V 3' 3) 



8. Verify that the set of vectors $\{(1, 0), (0, 1)\}$ is orthogonal with respect to the inner product $\langle \mathbf{u}, \mathbf{v} \rangle = 4u_1 v_1 + u_2 v_2$ on $R^2$; 
then convert it to an orthonormal set by normalizing the vectors. 

q Verify that the vectors \*i = f — — , — , J, V2 = [— , — , ], V3 = (0, 0, 1) form an orthonormal basis for gl with the 

Euclidean inner product; then use Theorem 6.3.1 to express each of the following as linear combinations of vi, Y2> an< ^ V3. 



(a) (1, -1,2) 

(b) (3, -7,4) 

(c) fl _11) 



10. Verify that the vectors 

\[ \mathbf{v}_1 = (1, -1, 2, -1), \quad \mathbf{v}_2 = (-2, 2, 3, 2), \quad \mathbf{v}_3 = (1, 2, 0, -1), \quad \mathbf{v}_4 = (1, 0, 0, 1) \]

form an orthogonal basis for $R^4$ with the Euclidean inner product; then use Formula 1 to express each of the following as 
linear combinations of $\mathbf{v}_1$, $\mathbf{v}_2$, $\mathbf{v}_3$, and $\mathbf{v}_4$. 



(a) (1, 1, 1, 1) 



(b) (/2, -3/2,5/2, -/2) 

(c) f_I 2 _1 4^ 
I 3' 3' 3'3j 



11. In each part, an orthonormal basis relative to the Euclidean inner product is given. Use Theorem 6.3.1 to find the 
coordinate vector of $\mathbf{w}$ with respect to that basis. 



(a ^ =(3>7);ui = [^__LJ, U2= (_L,^ 



(b) 



w= ( - 1, 0, 2); iM = (|, - 1, 1), u 2 = (|, 1 - Ij, u 3 = (1 §, I] 



^ Let j? 2 have the Euclidean inner product, and let S= {wi, W2} be the orthonormal basis with w\ = \— , — — I 

(a) Find the vectors u and v that have coordinate vectors (u) ^= (1, 1) and (v) ^= ( — 1, 4). 

(b) Compute ||u||, d(u, v)» an d (u, v} by applying Theorem 6.3.2 to the coordinate vectors ( u ) ^ and ( v ) ^; then check 
the results by performing the computations directly on u and v. 

^ Let £ 3 have the Euclidean inner product, and let s= {wi, W2 ? W3} be the orthonormal basis with wi = [0, — ^, -^ 1 

wa= (1,0,0), and W3 =fe^|\ 



(a) Find the vectors w, v, and w that have the coordinate vectors (u) ^= ( — 2, 1, 2)> (v) ^= (3, 0, — 2), and 

(w)^=(5, -4,1). 

(b) Compute ||v||, d{\\, w)> and (w, v} by applying Theorem 6.3.2 to the coordinate vectors ( u ) ^, ( v ) ^, and ( w ) ^ 
then check the results by performing the computations directly on w, v, and w. 



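14. In each part, $S$ represents some orthonormal basis for a four-dimensional inner product space. Use the given information 
to find $\|\mathbf{u}\|$, $\|\mathbf{v} - \mathbf{w}\|$, $\|\mathbf{v} + \mathbf{w}\|$, and $\langle \mathbf{v}, \mathbf{w} \rangle$. 

(a) $(\mathbf{u})_S = (-1, 2, 1, 3)$, $(\mathbf{v})_S = (0, -3, 1, 5)$, $(\mathbf{w})_S = (-2, -4, 3, 1)$ 

(b) $(\mathbf{u})_S = (0, 0, -1, -1)$, $(\mathbf{v})_S = (5, 5, -2, -2)$, $(\mathbf{w})_S = (3, 0, -3, 0)$ 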



15. 

(a) Show that the vectors $\mathbf{v}_1 = (1, -2, 3, -4)$, $\mathbf{v}_2 = (2, 1, -4, -3)$, $\mathbf{v}_3 = (-3, 4, 1, -2)$, and $\mathbf{v}_4 = (4, 3, 2, 1)$ 
form an orthogonal basis for $R^4$ with the Euclidean inner product. 

(b) Use Formula 1 to express $\mathbf{u} = (-1, 2, 3, 7)$ as a linear combination of the vectors in part (a). 



16. Let $R^2$ have the Euclidean inner product. Use the Gram-Schmidt process to transform the basis $\{\mathbf{u}_1, \mathbf{u}_2\}$ into an 
orthonormal basis. Draw both sets of basis vectors in the $xy$-plane. 

(a) $\mathbf{u}_1 = (1, -3)$, $\mathbf{u}_2 = (2, 2)$ 

(b) $\mathbf{u}_1 = (1, 0)$, $\mathbf{u}_2 = (3, -5)$ 

17. Let $R^3$ have the Euclidean inner product. Use the Gram-Schmidt process to transform the basis $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ into an 
orthonormal basis. 

(a) $\mathbf{u}_1 = (1, 1, 1)$, $\mathbf{u}_2 = (-1, 1, 0)$, $\mathbf{u}_3 = (1, 2, 1)$ 

(b) $\mathbf{u}_1 = (1, 0, 0)$, $\mathbf{u}_2 = (3, 7, -2)$, $\mathbf{u}_3 = (0, 4, 1)$ 

18. Let $R^4$ have the Euclidean inner product. Use the Gram-Schmidt process to transform the basis $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \mathbf{u}_4\}$ into an 
orthonormal basis. 

\[ \mathbf{u}_1 = (0, 2, 1, 0), \quad \mathbf{u}_2 = (1, -1, 0, 0), \quad \mathbf{u}_3 = (1, 2, 0, -1), \quad \mathbf{u}_4 = (1, 0, 0, 1) \]

19. Let $R^3$ have the Euclidean inner product. Find an orthonormal basis for the subspace spanned by $(0, 1, 2)$, $(-1, 0, 1)$, 
$(-1, -1, 3)$. 

20. Let $R^3$ have the inner product $\langle \mathbf{u}, \mathbf{v} \rangle = u_1 v_1 + 2u_2 v_2 + 3u_3 v_3$. Use the Gram-Schmidt process to transform 
$\mathbf{u}_1 = (1, 1, 1)$, $\mathbf{u}_2 = (1, 1, 0)$, $\mathbf{u}_3 = (1, 0, 0)$ into an orthonormal basis. 



^ The subspace of j? 3 spanned by the vectors m = f— , 0, — — 1 and u 2 = (0, 1, 0) is a plane passing through the origin. 
Express w = (1, 2, 3) in the form y = W1 _|- w ^ , where wi lies in the plane and W2 is perpendicular to the plane. 



22. Repeat Exercise 21 with $\mathbf{u}_1 = (1, 1, 1)$ and $\mathbf{u}_2 = (2, 0, -1)$. 

23. Let $R^4$ have the Euclidean inner product. Express $\mathbf{w} = (-1, 2, 6, 0)$ in the form $\mathbf{w} = \mathbf{w}_1 + \mathbf{w}_2$, where $\mathbf{w}_1$ is in the space 
$W$ spanned by $\mathbf{u}_1 = (-1, 0, 1, 2)$ and $\mathbf{u}_2 = (0, 1, 0, 1)$, and $\mathbf{w}_2$ is orthogonal to $W$. 



24. Find the $QR$-decomposition of the matrix, where possible. 



(a) 



1 -1 

2 3 



(b) 



"1 


2" 





1 


1 


4 



(c) 



1 


r 


-2 


1 


2 


1 



(d) 



"1 





2" 





1 


1 


1 


2 






(e) 



"1 


2 


r 


1 


1 


1 





3 


1 



(f) 



1 


f 


-1 1 


1 


1 


1 


-1 1 


1 



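25. Let $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ be an orthonormal basis for an inner product space $V$. Show that if $\mathbf{w}$ is a vector in $V$, then 

\[ \|\mathbf{w}\|^2 = \langle \mathbf{w}, \mathbf{v}_1 \rangle^2 + \langle \mathbf{w}, \mathbf{v}_2 \rangle^2 + \langle \mathbf{w}, \mathbf{v}_3 \rangle^2 \]

26. Let $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be an orthonormal basis for an inner product space $V$. Show that if $\mathbf{w}$ is a vector in $V$, then 

\[ \|\mathbf{w}\|^2 = \langle \mathbf{w}, \mathbf{v}_1 \rangle^2 + \langle \mathbf{w}, \mathbf{v}_2 \rangle^2 + \cdots + \langle \mathbf{w}, \mathbf{v}_n \rangle^2 \]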



27. In Step 3 of the proof of Theorem 6.3.6, it was stated that "the linear independence of $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ ensures that 
$\mathbf{v}_3 \neq \mathbf{0}$." Prove this statement. 



28. Prove that the diagonal entries of $R$ in Formula 9 are nonzero. 



29. (For Readers Who Have Studied Calculus) 

Let the vector space $P_2$ have the inner product 

\[ \langle \mathbf{p}, \mathbf{q} \rangle = \int_{-1}^{1} p(x) q(x)\, dx \]

Apply the Gram-Schmidt process to transform the standard basis $S = \{1, x, x^2\}$ into an orthonormal basis. (The 
polynomials in the resulting basis are called the first three normalized Legendre polynomials.) 

30. (For Readers Who Have Studied Calculus) 

Use Theorem 6.3.1 to express the following as linear combinations of the first three normalized Legendre polynomials 
(Exercise 29). 

(a) $1 + x + 4x^2$ 

(b) $2 - 7x^2$ 

(c) $4 + 3x$ 

31. (For Readers Who Have Studied Calculus) 

Let $P_2$ have the inner product 

\[ \langle \mathbf{p}, \mathbf{q} \rangle = \int p(x) q(x)\, dx \]

Apply the Gram-Schmidt process to transform the standard basis $S = \{1, x, x^2\}$ into an orthonormal basis. 

32. Prove Theorem 6.3.2. 

33. Prove Theorem 6.3.5. 



Discussion 
Discovery 

34. 

(a) It follows from Theorem 6.3.6 that every plane through the origin in $R^3$ must have an 
orthonormal basis with respect to the Euclidean inner product. In words, explain how 
you would go about finding an orthonormal basis for a plane if you knew its equation. 

(b) Use your method to find an orthonormal basis for the plane $x + 2y - z = 0$. 

35. Find vectors $\mathbf{x}$ and $\mathbf{y}$ in $R^2$ that are orthonormal with respect to the inner product 
$\langle \mathbf{u}, \mathbf{v} \rangle = 3u_1 v_1 + 2u_2 v_2$ but are not orthonormal with respect to the Euclidean inner product. 

36. If $W$ is a line through the origin of $R^3$ with the Euclidean inner product, and if $\mathbf{u}$ is a vector in 
$R^3$, then Theorem 6.3.4 implies that $\mathbf{u}$ can be expressed uniquely as $\mathbf{u} = \mathbf{w}_1 + \mathbf{w}_2$, where $\mathbf{w}_1$ is a 
vector in $W$ and $\mathbf{w}_2$ is a vector in $W^\perp$. Draw a picture that illustrates this. 

37. Indicate whether each statement is always true or sometimes false. Justify your answer by 
giving a logical argument or a counterexample. 

(a) A linearly dependent set of vectors in an inner product space cannot be orthonormal. 

(b) Every finite-dimensional vector space has an orthonormal basis. 

(c) $\operatorname{proj}_W \mathbf{u}$ is orthogonal to $\operatorname{proj}_{W^\perp} \mathbf{u}$ in any inner product space. 

(d) Every matrix with a nonzero determinant has a $QR$-decomposition. 

38. What happens if you apply the Gram-Schmidt process to a linearly dependent set of vectors? 

Copyright © 2005 John Wiley & Sons, Inc. All rights reserved. 



6.4 

BEST APPROXIMATION; LEAST SQUARES 

In this section we shall show how orthogonal projections can be used to 
solve certain approximation problems. The results obtained in this section 
have a wide variety of applications in both mathematics and science. 



Orthogonal Projections Viewed as Approximations 

If $P$ is a point in ordinary 3-space and $W$ is a plane through the origin, then the point $Q$ in $W$ that is closest to $P$ can be 
obtained by dropping a perpendicular from $P$ to $W$ (Figure 6.4.1a). Therefore, if we let $\mathbf{u} = \overrightarrow{OP}$, then the distance between $P$ 
and $W$ is given by 

\[ \|\mathbf{u} - \operatorname{proj}_W \mathbf{u}\| \]

In other words, among all vectors $\mathbf{w}$ in $W$, the vector $\mathbf{w} = \operatorname{proj}_W \mathbf{u}$ minimizes the distance $\|\mathbf{u} - \mathbf{w}\|$ (Figure 6.4.1b). 

Figure 6.4.1 (a) $Q$ is the point in $W$ closest to $P$. (b) $\|\mathbf{u} - \mathbf{w}\|$ is minimized by $\mathbf{w} = \operatorname{proj}_W \mathbf{u}$. 



There is another way of thinking about this idea. View $\mathbf{u}$ as a fixed vector that we would like to approximate by a vector in 
$W$. Any such approximation $\mathbf{w}$ will result in an "error vector," 

\[ \mathbf{u} - \mathbf{w} \]

that, unless $\mathbf{u}$ is in $W$, cannot be made equal to $\mathbf{0}$. However, by choosing 

\[ \mathbf{w} = \operatorname{proj}_W \mathbf{u} \]

we can make the length of the error vector 

\[ \|\mathbf{u} - \mathbf{w}\| = \|\mathbf{u} - \operatorname{proj}_W \mathbf{u}\| \]

as small as possible. Thus we can describe $\operatorname{proj}_W \mathbf{u}$ as the "best approximation" to $\mathbf{u}$ by vectors in $W$. The following theorem 
will make these intuitive ideas precise. 



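THEOREM 6.4.1 

Best Approximation Theorem 

If $W$ is a finite-dimensional subspace of an inner product space $V$, and if $\mathbf{u}$ is a vector in $V$, then $\operatorname{proj}_W \mathbf{u}$ is the best 
approximation to $\mathbf{u}$ from $W$ in the sense that 

\[ \|\mathbf{u} - \operatorname{proj}_W \mathbf{u}\| < \|\mathbf{u} - \mathbf{w}\| \]

for every vector $\mathbf{w}$ in $W$ that is different from $\operatorname{proj}_W \mathbf{u}$. 

Proof For every vector $\mathbf{w}$ in $W$, we can write 

\[ \mathbf{u} - \mathbf{w} = (\mathbf{u} - \operatorname{proj}_W \mathbf{u}) + (\operatorname{proj}_W \mathbf{u} - \mathbf{w}) \tag{1} \]

But $\operatorname{proj}_W \mathbf{u} - \mathbf{w}$, being a difference of vectors in $W$, is in $W$; and $\mathbf{u} - \operatorname{proj}_W \mathbf{u}$ is orthogonal to $W$, so the 
two terms on the right side of (1) are orthogonal. Thus, by the Theorem of Pythagoras (Theorem 
6.2.4), 

\[ \|\mathbf{u} - \mathbf{w}\|^2 = \|\mathbf{u} - \operatorname{proj}_W \mathbf{u}\|^2 + \|\operatorname{proj}_W \mathbf{u} - \mathbf{w}\|^2 \]

If $\mathbf{w} \neq \operatorname{proj}_W \mathbf{u}$, then the second term in this sum will be positive, so 

\[ \|\mathbf{u} - \mathbf{w}\|^2 > \|\mathbf{u} - \operatorname{proj}_W \mathbf{u}\|^2 \]

or, equivalently, 

\[ \|\mathbf{u} - \mathbf{w}\| > \|\mathbf{u} - \operatorname{proj}_W \mathbf{u}\| \]

■ 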

Applications of this theorem will be given later in the text. 
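A small numerical check can make the Best Approximation Theorem concrete. The sketch below is illustrative and not from the text: it takes $W$ to be a line in $R^2$ spanned by a hypothetical vector `a`, projects a hypothetical vector `u` onto it, and compares the squared error of the projection with the squared error of a few other vectors in $W$. Squared distances are used so that all arithmetic stays exact.

```python
from fractions import Fraction

def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(x * y for x, y in zip(u, v))

def proj_onto_line(u, a):
    """Orthogonal projection of u onto the line W = span{a}, with a nonzero."""
    c = Fraction(dot(u, a), dot(a, a))
    return [c * x for x in a]

def sq_dist(u, w):
    """Squared Euclidean distance ||u - w||^2."""
    d = [x - y for x, y in zip(u, w)]
    return dot(d, d)

# Hypothetical data: u is the vector to approximate, a spans the line W.
u, a = [3, 1], [1, 2]
p = proj_onto_line(u, a)
best = sq_dist(u, p)

# Other vectors t*a in W; each should be strictly farther from u than p is.
scalars = (Fraction(0), Fraction(1, 2), Fraction(2), Fraction(-3))
others = [sq_dist(u, [t * x for x in a]) for t in scalars]

print("projection:", p, "squared error:", best)
print("squared errors of other vectors in W:", others)
```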

Least Squares Solutions of Linear Systems 

Up to now we have been concerned primarily with consistent systems of linear equations. However, inconsistent linear 
systems are also important in physical applications. It is a common situation that some physical problem leads to a linear 
system $A\mathbf{x} = \mathbf{b}$ that should be consistent on theoretical grounds but fails to be so because "measurement errors" in the entries 
of $A$ and $\mathbf{b}$ perturb the system enough to cause inconsistency. In such situations one looks for a value of $\mathbf{x}$ that comes "as 
close as possible" to being a solution in the sense that it minimizes the value of $\|A\mathbf{x} - \mathbf{b}\|$ with respect to the Euclidean inner 
product. The quantity $\|A\mathbf{x} - \mathbf{b}\|$ can be viewed as a measure of the "error" that results from regarding $\mathbf{x}$ as an approximate 
solution of the linear system $A\mathbf{x} = \mathbf{b}$. If the system is consistent and $\mathbf{x}$ is an exact solution, then the error is zero, since 
$\|A\mathbf{x} - \mathbf{b}\| = \|\mathbf{0}\| = 0$. In general, the larger the value of $\|A\mathbf{x} - \mathbf{b}\|$, the more poorly $\mathbf{x}$ serves as an approximate solution of the 
system. 

Least Squares Problem Given a linear system $A\mathbf{x} = \mathbf{b}$ of $m$ equations in $n$ unknowns, find a vector $\mathbf{x}$, if possible, that 
minimizes $\|A\mathbf{x} - \mathbf{b}\|$ with respect to the Euclidean inner product on $R^m$. Such a vector is called a least squares solution of 
$A\mathbf{x} = \mathbf{b}$. 



Remark To understand the origin of the term least squares, let $\mathbf{e} = A\mathbf{x} - \mathbf{b}$, which we can view as the error vector that 
results from the approximation $\mathbf{x}$. If $\mathbf{e} = (e_1, e_2, \ldots, e_m)$, then a least squares solution minimizes 
$\|\mathbf{e}\| = (e_1^2 + e_2^2 + \cdots + e_m^2)^{1/2}$; hence it also minimizes $\|\mathbf{e}\|^2 = e_1^2 + e_2^2 + \cdots + e_m^2$. Hence the term least squares. 

To solve the least squares problem, let $W$ be the column space of $A$. For each $n \times 1$ matrix $\mathbf{x}$, the product $A\mathbf{x}$ is a linear 
combination of the column vectors of $A$. Thus, as $\mathbf{x}$ varies over $R^n$, the vector $A\mathbf{x}$ varies over all possible linear combinations 
of the column vectors of $A$; that is, $A\mathbf{x}$ varies over the entire column space $W$. Geometrically, solving the least squares 
problem amounts to finding a vector $\mathbf{x}$ in $R^n$ such that $A\mathbf{x}$ is the closest vector in $W$ to $\mathbf{b}$ (Figure 6.4.2). 




Figure 6.4.2 A least squares solution $\mathbf{x}$ produces the vector $A\mathbf{x}$ in $W$ closest to $\mathbf{b}$. ($W$ = column space of $A$.) 



It follows from the Best Approximation Theorem (6.4.1) that the closest vector in $W$ to $\mathbf{b}$ is the orthogonal projection of $\mathbf{b}$ on 
$W$. Thus, for a vector $\mathbf{x}$ to be a least squares solution of $A\mathbf{x} = \mathbf{b}$, this vector must satisfy 

\[ A\mathbf{x} = \operatorname{proj}_W \mathbf{b} \tag{2} \]

One could attempt to find least squares solutions of $A\mathbf{x} = \mathbf{b}$ by first calculating the vector $\operatorname{proj}_W \mathbf{b}$ and then solving (2); 
however, there is a better approach. It follows from the Projection Theorem (6.3.4) and Formula 5 of Section 6.3 that 

\[ \mathbf{b} - A\mathbf{x} = \mathbf{b} - \operatorname{proj}_W \mathbf{b} \]

is orthogonal to $W$. But $W$ is the column space of $A$, so it follows from Theorem 6.2.6 that $\mathbf{b} - A\mathbf{x}$ lies in the nullspace of $A^T$. 
Therefore, a least squares solution of $A\mathbf{x} = \mathbf{b}$ must satisfy 

\[ A^T(\mathbf{b} - A\mathbf{x}) = \mathbf{0} \]

or, equivalently, 

\[ A^T A\mathbf{x} = A^T \mathbf{b} \tag{3} \]

This is called the normal system associated with $A\mathbf{x} = \mathbf{b}$, and the individual equations are called the normal equations 
associated with $A\mathbf{x} = \mathbf{b}$. Thus the problem of finding a least squares solution of $A\mathbf{x} = \mathbf{b}$ has been reduced to the problem of 
finding an exact solution of the associated normal system. 

Note the following observations about the normal system: 

* The normal system involves $n$ equations in $n$ unknowns (verify). 

* The normal system is consistent, since it is satisfied by a least squares solution of $A\mathbf{x} = \mathbf{b}$. 

* The normal system may have infinitely many solutions, in which case all of its solutions are least squares solutions of 
$A\mathbf{x} = \mathbf{b}$. 
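The last observation can be seen in a tiny example. The following sketch uses hypothetical data, not taken from the text: the two columns of `A` are linearly dependent, so the normal system has infinitely many solutions, yet every one of them produces the same vector $A\mathbf{x}$, namely the orthogonal projection of $\mathbf{b}$ on the column space.

```python
def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(M, x):
    """Product of a matrix (list of rows) with a vector (list)."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

# Hypothetical inconsistent system with linearly dependent columns.
A = [[1, 2],
     [1, 2]]
b = [1, 3]

At = transpose(A)
AtA = matmul(At, A)      # coefficient matrix of the normal system
Atb = matvec(At, b)      # right-hand side of the normal system

# Three different solutions of the normal system x1 + 2*x2 = 2.
candidates = [[2, 0], [0, 1], [4, -1]]
for x in candidates:
    assert matvec(AtA, x) == Atb         # x satisfies the normal system
    print(x, "->", matvec(A, x))         # A x is the same for every solution
```

All three candidates give $A\mathbf{x} = (2, 2)$, which is the projection of $\mathbf{b} = (1, 3)$ on the line spanned by $(1, 1)$.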

From these observations and Formula 2, we have the following theorem. 

THEOREM 6.4.2 

For any linear system $A\mathbf{x} = \mathbf{b}$, the associated normal system 

\[ A^T A\mathbf{x} = A^T \mathbf{b} \]

is consistent, and all solutions of the normal system are least squares solutions of $A\mathbf{x} = \mathbf{b}$. 
Moreover, if $W$ is the column space of $A$, and $\mathbf{x}$ is any least squares solution of $A\mathbf{x} = \mathbf{b}$, then the 
orthogonal projection of $\mathbf{b}$ on $W$ is 

\[ \operatorname{proj}_W \mathbf{b} = A\mathbf{x} \]


Uniqueness of Least Squares Solutions 

Before we examine some numerical examples, we shall establish conditions under which a linear system is guaranteed to 
have a unique least squares solution. We shall need the following theorem. 



THEOREM 6.4.3 

If $A$ is an $m \times n$ matrix, then the following are equivalent. 

(a) $A$ has linearly independent column vectors. 

(b) $A^T A$ is invertible. 

Proof We shall prove that (a) ⇒ (b) and leave the proof that (b) ⇒ (a) as an exercise. 

(a) ⇒ (b) Assume that $A$ has linearly independent column vectors. The matrix $A^T A$ has size $n \times n$, so we can prove that this 
matrix is invertible by showing that the linear system $A^T A\mathbf{x} = \mathbf{0}$ has only the trivial solution. But if $\mathbf{x}$ is any solution of this 
system, then $A\mathbf{x}$ is in the nullspace of $A^T$ and also in the column space of $A$. By Theorem 6.2.6 these spaces are orthogonal 
complements, so part (b) of Theorem 6.2.5 implies that $A\mathbf{x} = \mathbf{0}$. But $A$ has linearly independent column vectors, so $\mathbf{x} = \mathbf{0}$ by 
Theorem 5.6.8. 

■ 

The next theorem is a direct consequence of Theorems 6.4.2 and 6.4.3. We omit the details. 

THEOREM 6.4.4 

If $A$ is an $m \times n$ matrix with linearly independent column vectors, then for every $m \times 1$ matrix $\mathbf{b}$, the linear system $A\mathbf{x} = \mathbf{b}$ 
has a unique least squares solution. This solution is given by 

\[ \mathbf{x} = (A^T A)^{-1} A^T \mathbf{b} \tag{4} \]

Moreover, if $W$ is the column space of $A$, then the orthogonal projection of $\mathbf{b}$ on $W$ is 

\[ \operatorname{proj}_W \mathbf{b} = A\mathbf{x} = A(A^T A)^{-1} A^T \mathbf{b} \tag{5} \]



Remark Formulas 4 and 5 have various theoretical applications, but they are very inefficient for numerical calculations. 
Least squares solutions of $A\mathbf{x} = \mathbf{b}$ are typically found by using Gaussian elimination to solve the normal equations, and the 
orthogonal projection of $\mathbf{b}$ on the column space of $A$, if needed, is best obtained by computing $A\mathbf{x}$, where $\mathbf{x}$ is the least 
squares solution of $A\mathbf{x} = \mathbf{b}$. The $QR$-decomposition of $A$ is also used to find least squares solutions of $A\mathbf{x} = \mathbf{b}$. 
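One way the $QR$-decomposition can be used is sketched below. This is an illustrative outline, not the text's own procedure: substituting $A = QR$ into the normal system gives $R^T R\mathbf{x} = R^T Q^T \mathbf{b}$, and since $R$ is invertible this reduces to the triangular system $R\mathbf{x} = Q^T \mathbf{b}$, which is solved by back substitution. The function name `least_squares_qr` and the sample data (fitting a line to three points) are assumptions made for the demonstration.

```python
import math

def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def least_squares_qr(columns, b):
    """Least squares solution of A x = b, where `columns` lists the columns of A.

    The columns are assumed to be linearly independent.
    """
    n = len(columns)
    qs, R = [], [[0.0] * n for _ in range(n)]
    # Gram-Schmidt with normalization: builds Q (as qs) and R.
    for j, u in enumerate(columns):
        v = [float(x) for x in u]
        for i, q in enumerate(qs):
            R[i][j] = dot(u, q)
            v = [vk - R[i][j] * qk for vk, qk in zip(v, q)]
        R[j][j] = math.sqrt(dot(v, v))
        qs.append([vk / R[j][j] for vk in v])
    # Solve R x = Q^T b by back substitution (R is upper triangular).
    y = [dot(q, b) for q in qs]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(R[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / R[i][i]
    return x

# Hypothetical data: fit y = c0 + c1*t to the points (0, 1), (1, 3), (2, 4).
x = least_squares_qr([[1, 1, 1], [0, 1, 2]], [1, 3, 4])
print(x)
```

For this data the normal system is $3c_0 + 3c_1 = 8$, $3c_0 + 5c_1 = 11$, whose solution $c_0 = 7/6$, $c_1 = 3/2$ agrees with the printed result up to rounding.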



EXAMPLE 1 Least Squares Solution 



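Find the least squares solution of the linear system $A\mathbf{x} = \mathbf{b}$ given by 

\[ \begin{aligned} x_1 - x_2 &= 4 \\ 3x_1 + 2x_2 &= 1 \\ -2x_1 + 4x_2 &= 3 \end{aligned} \]

and find the orthogonal projection of $\mathbf{b}$ on the column space of $A$. 

Solution 

Here 

\[ A = \begin{bmatrix} 1 & -1 \\ 3 & 2 \\ -2 & 4 \end{bmatrix} \quad \text{and} \quad \mathbf{b} = \begin{bmatrix} 4 \\ 1 \\ 3 \end{bmatrix} \]

Observe that $A$ has linearly independent column vectors, so we know in advance that there is a unique least squares solution. 
We have 

\[ A^T A = \begin{bmatrix} 1 & 3 & -2 \\ -1 & 2 & 4 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ 3 & 2 \\ -2 & 4 \end{bmatrix} = \begin{bmatrix} 14 & -3 \\ -3 & 21 \end{bmatrix} \]

\[ A^T \mathbf{b} = \begin{bmatrix} 1 & 3 & -2 \\ -1 & 2 & 4 \end{bmatrix} \begin{bmatrix} 4 \\ 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 1 \\ 10 \end{bmatrix} \]

so the normal system $A^T A\mathbf{x} = A^T \mathbf{b}$ in this case is 

\[ \begin{bmatrix} 14 & -3 \\ -3 & 21 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 10 \end{bmatrix} \]

Solving this system yields the least squares solution 

\[ x_1 = \frac{17}{95}, \quad x_2 = \frac{143}{285} \]

From Formula 5, the orthogonal projection of $\mathbf{b}$ on the column space of $A$ is 

\[ A\mathbf{x} = \begin{bmatrix} 1 & -1 \\ 3 & 2 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} 17/95 \\ 143/285 \end{bmatrix} = \begin{bmatrix} -92/285 \\ 439/285 \\ 94/57 \end{bmatrix} \]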
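The arithmetic in Example 1 can be reproduced exactly with rational numbers. The sketch below is a minimal illustration under stated assumptions: the helper names (`solve`, `least_squares`, and so on) are made up for this demonstration, and `solve` assumes the coefficient matrix is invertible, which Theorem 6.4.3 guarantees here because the columns of $A$ are linearly independent.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def solve(M, rhs):
    """Solve M z = rhs by Gauss-Jordan elimination; M is assumed invertible."""
    n = len(M)
    aug = [[Fraction(a) for a in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [a / p for a in aug[col]]
        # Clear the column in every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * c for a, c in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def least_squares(A, b):
    """Solve the normal system A^T A x = A^T b."""
    At = transpose(A)
    return solve(matmul(At, A), matvec(At, b))

# Data of Example 1.
A = [[1, -1], [3, 2], [-2, 4]]
b = [4, 1, 3]

x = least_squares(A, b)
projection = matvec(A, x)
print("x =", x)
print("proj_W b =", projection)
```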



Remark The language used for least squares problems is somewhat misleading. A least squares solution of $A\mathbf{x} = \mathbf{b}$ is not in 
fact a solution of $A\mathbf{x} = \mathbf{b}$ unless $A\mathbf{x} = \mathbf{b}$ happens to be consistent; it is a solution of the related system $A^T A\mathbf{x} = A^T \mathbf{b}$ instead. 



EXAMPLE 2 Orthogonal Projection on a Subspace 



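Find the orthogonal projection of the vector $\mathbf{u} = (-3, -3, 8, 9)$ on the subspace of $R^4$ spanned by the vectors 

\[ \mathbf{u}_1 = (3, 1, 0, 1), \quad \mathbf{u}_2 = (1, 2, 1, 1), \quad \mathbf{u}_3 = (-1, 0, 2, -1) \]

Solution 

One could solve this problem by first using the Gram-Schmidt process to convert $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ into an orthonormal basis and 
then applying the method used in Example 6 of Section 6.3. However, the following method is more efficient. 

The subspace $W$ of $R^4$ spanned by $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$ is the column space of the matrix 

\[ A = \begin{bmatrix} 3 & 1 & -1 \\ 1 & 2 & 0 \\ 0 & 1 & 2 \\ 1 & 1 & -1 \end{bmatrix} \]

Thus, if $\mathbf{u}$ is expressed as a column vector, we can find the orthogonal projection of $\mathbf{u}$ on $W$ by finding a least squares 
solution of the system $A\mathbf{x} = \mathbf{u}$ and then calculating $\operatorname{proj}_W \mathbf{u} = A\mathbf{x}$ from the least squares solution. The computations are as 
follows: The system $A\mathbf{x} = \mathbf{u}$ is 

\[ \begin{bmatrix} 3 & 1 & -1 \\ 1 & 2 & 0 \\ 0 & 1 & 2 \\ 1 & 1 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -3 \\ -3 \\ 8 \\ 9 \end{bmatrix} \]

so 

\[ A^T A = \begin{bmatrix} 3 & 1 & 0 & 1 \\ 1 & 2 & 1 & 1 \\ -1 & 0 & 2 & -1 \end{bmatrix} \begin{bmatrix} 3 & 1 & -1 \\ 1 & 2 & 0 \\ 0 & 1 & 2 \\ 1 & 1 & -1 \end{bmatrix} = \begin{bmatrix} 11 & 6 & -4 \\ 6 & 7 & 0 \\ -4 & 0 & 6 \end{bmatrix} \]

\[ A^T \mathbf{u} = \begin{bmatrix} 3 & 1 & 0 & 1 \\ 1 & 2 & 1 & 1 \\ -1 & 0 & 2 & -1 \end{bmatrix} \begin{bmatrix} -3 \\ -3 \\ 8 \\ 9 \end{bmatrix} = \begin{bmatrix} -3 \\ 8 \\ 10 \end{bmatrix} \]

The normal system $A^T A\mathbf{x} = A^T \mathbf{u}$ in this case is 

\[ \begin{bmatrix} 11 & 6 & -4 \\ 6 & 7 & 0 \\ -4 & 0 & 6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -3 \\ 8 \\ 10 \end{bmatrix} \]

Solving this system yields 

\[ \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix} \]

as the least squares solution of $A\mathbf{x} = \mathbf{u}$ (verify), so 

\[ \operatorname{proj}_W \mathbf{u} = A\mathbf{x} = \begin{bmatrix} 3 & 1 & -1 \\ 1 & 2 & 0 \\ 0 & 1 & 2 \\ 1 & 1 & -1 \end{bmatrix} \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} -2 \\ 3 \\ 4 \\ 0 \end{bmatrix} \]

or, in horizontal notation (which is consistent with the original phrasing of the problem), $\operatorname{proj}_W \mathbf{u} = (-2, 3, 4, 0)$. 

In Section 4.2 we discussed some basic orthogonal projection operators on $R^2$ and $R^3$ (Tables 4 and 5). The concept of an 
orthogonal projection operator can be extended to higher-dimensional Euclidean spaces as follows. 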



If $W$ is a subspace of $R^m$, then the transformation $P: R^m \to W$ that maps each vector $\mathbf{x}$ in $R^m$ into its orthogonal 
projection $\operatorname{proj}_W \mathbf{x}$ in $W$ is called the orthogonal projection of $R^m$ on $W$. 

We leave it as an exercise to show that orthogonal projections are linear operators. It follows from Formula 5 that the 
standard matrix for the orthogonal projection of $R^m$ on $W$ is 

\[ [P] = A(A^T A)^{-1} A^T \tag{6} \]

where $A$ is constructed using any basis for $W$ as its column vectors. 
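Formula 6 can be evaluated without forming $(A^T A)^{-1}$ explicitly. The sketch below is illustrative only (the helper names are assumptions, and `solve` assumes an invertible coefficient matrix): column $j$ of $(A^T A)^{-1} A^T$ is obtained by solving $(A^T A)\mathbf{z} = $ (row $j$ of $A$), and multiplying by $A$ gives column $j$ of $[P]$. The matrix $A$ and vector $\mathbf{u}$ are those of Example 2.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def solve(M, rhs):
    """Solve M z = rhs by Gauss-Jordan elimination; M is assumed invertible."""
    n = len(M)
    aug = [[Fraction(a) for a in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [a / p for a in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * c for a, c in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def projection_matrix(A):
    """Standard matrix A (A^T A)^{-1} A^T; the columns of A are assumed independent."""
    AtA = matmul(transpose(A), A)
    # Row j of A is column j of A^T, so each solve gives one column of (A^T A)^{-1} A^T.
    z_cols = [solve(AtA, row) for row in A]
    p_cols = [matvec(A, z) for z in z_cols]
    return transpose(p_cols)

# Matrix A and vector u from Example 2.
A = [[3, 1, -1], [1, 2, 0], [0, 1, 2], [1, 1, -1]]
u = [-3, -3, 8, 9]

P = projection_matrix(A)
print(matvec(P, u))
```

Applying `P` to `u` reproduces the projection found in Example 2, and the computed matrix is symmetric and satisfies $[P]^2 = [P]$, as one expects of an orthogonal projection.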



EXAMPLE 3 Verifying Formula (6) 



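In Table 5 of Section 4.2 we showed that the standard matrix for the orthogonal projection of $R^3$ on the $xy$-plane is 

\[ [P] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \tag{7} \]

To see that this is consistent with Formula 6, take the unit vectors along the positive $x$ and $y$ axes as a basis for the $xy$-plane, 
so that 

\[ A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \]

We leave it for the reader to verify that $A^T A$ is the $2 \times 2$ identity matrix; thus Formula 6 simplifies to 

\[ [P] = AA^T = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]

which agrees with (7). 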


EXAMPLE 4 Standard Matrix for an Orthogonal Projection 



Find the standard matrix for the orthogonal projection $P$ of $R^2$ on the line $l$ that passes through the origin and makes an angle 
$\theta$ with the positive $x$-axis. 

Solution 

The line $l$ is a one-dimensional subspace of $R^2$. As illustrated in Figure 6.4.3, we can take $\mathbf{v} = (\cos\theta, \sin\theta)$ as a basis for this 
subspace, so 

\[ A = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix} \]

We leave it for the reader to show that $A^T A$ is the $1 \times 1$ identity matrix; thus Formula 6 simplifies to 

\[ [P] = AA^T = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix} \begin{bmatrix} \cos\theta & \sin\theta \end{bmatrix} = \begin{bmatrix} \cos^2\theta & \sin\theta\cos\theta \\ \sin\theta\cos\theta & \sin^2\theta \end{bmatrix} \]

Note that this agrees with Example 6 of Section 4.3. 

Figure 6.4.3 
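The matrix found in Example 4 is simple enough to evaluate directly. The following sketch is illustrative: the function name `projection_onto_line` and the sample angle and vector are assumptions for the demonstration. It builds $[P] = AA^T$ for a given angle $\theta$ (in radians) and applies it to a vector.

```python
import math

def projection_onto_line(theta):
    """Standard matrix of the orthogonal projection of R^2 on the line through
    the origin that makes angle theta (radians) with the positive x-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, s * c],
            [s * c, s * s]]

def apply(P, x):
    """Multiply a 2 x 2 matrix by a vector in R^2."""
    return [P[0][0] * x[0] + P[0][1] * x[1],
            P[1][0] * x[0] + P[1][1] * x[1]]

# Hypothetical check: project (2, 0) on the line y = x (theta = pi/4).
P = projection_onto_line(math.pi / 4)
print(P)
print(apply(P, [2.0, 0.0]))
```

For $\theta = \pi/4$ every entry of $[P]$ is $1/2$ up to rounding, and the vector $(2, 0)$ projects to approximately $(1, 1)$ on the line $y = x$.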

Summary 

Theorem 6.4.3 enables us to add yet another result to Theorem 6.2.7. 



THEOREM 6.4.5 



Equivalent Statements 

If $A$ is an $n \times n$ matrix, and if $T_A: R^n \to R^n$ is multiplication by $A$, then the following are equivalent. 

(a) $A$ is invertible. 

(b) $A\mathbf{x} = \mathbf{0}$ has only the trivial solution. 

(c) The reduced row-echelon form of $A$ is $I_n$. 

(d) $A$ is expressible as a product of elementary matrices. 

(e) $A\mathbf{x} = \mathbf{b}$ is consistent for every $n \times 1$ matrix $\mathbf{b}$. 

(f) $A\mathbf{x} = \mathbf{b}$ has exactly one solution for every $n \times 1$ matrix $\mathbf{b}$. 

(g) $\det(A) \neq 0$. 

(h) The range of $T_A$ is $R^n$. 

(i) $T_A$ is one-to-one. 

(j) The column vectors of $A$ are linearly independent. 

(k) The row vectors of $A$ are linearly independent. 

(l) The column vectors of $A$ span $R^n$. 

(m) The row vectors of $A$ span $R^n$. 

(n) The column vectors of A form a basis for R n . 

(o) The row vectors of A form a basis for R n . 

(p) A has rank n. 

(q) A has nullity 0. 

(r) The orthogonal complement of the nullspace of A is R n . 

(s) The orthogonal complement of the row space of A is {0}. 

(t) A T A is invertible. 



This theorem relates all of the major topics we have studied thus far. 



Exercise Set 6.4 






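1. Find the normal system associated with the given linear system. 

(a) 

\[ \begin{bmatrix} 1 & -1 \\ 2 & 3 \\ 4 & 5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \\ 5 \end{bmatrix} \]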



(b) 



2 
3 
1 
1 



1 o~ 




"-f 




~*l~| 






1 2 











*2 


^ 




4 5 


*3 




1 


2 4 






2 



2. In each part, find $\det(A^T A)$, and apply Theorem 6.4.3 to determine whether $A$ has linearly independent column vectors. 



(a) 



A = 



(b) 



,4 = 



1 3 2 

2 1 3 
1 1 




2 -1 

1 

1 
4 -5 


3 

1 

-2 

3 



3. Find the least squares solution of the linear system $A\mathbf{x} = \mathbf{b}$, and find the orthogonal projection of $\mathbf{b}$ onto the column space 
of $A$. 



(a) 



A= - 



1 f 




7" 


-1 1 


,b = 





-1 2 




-7 



(b) 



,4 = 



"2 


-2" 




2" 


1 


1 


,b = 


-1 


3 


1 




1 



(c) 



A = 



1 

2 1 
1 1 
1 1 



-f 




"6" 


-2 



,b = 



9 


-1 




3 



(d) 



A = 



2 





1 


-2 


2 


-1 





1 



f 




"o" 


2 



,b = 


6 



1 




6 



4. Find the orthogonal projection of $\mathbf{u}$ onto the subspace of $R^3$ spanned by the vectors $\mathbf{v}_1$ and $\mathbf{v}_2$. 

(a) $\mathbf{u} = (2, 1, 3)$; $\mathbf{v}_1 = (1, 1, 0)$, $\mathbf{v}_2 = (1, 2, 1)$ 

(b) $\mathbf{u} = (1, -6, 1)$; $\mathbf{v}_1 = (-1, 2, 1)$, $\mathbf{v}_2 = (2, 2, 4)$ 

5. Find the orthogonal projection of $\mathbf{u}$ onto the subspace of $R^4$ spanned by the vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$. 

(a) $\mathbf{u} = (6, 3, 9, 6)$; $\mathbf{v}_1 = (2, 1, 1, 1)$, $\mathbf{v}_2 = (1, 0, 1, 1)$, $\mathbf{v}_3 = (-2, -1, 0, -1)$ 

(b) $\mathbf{u} = (-2, 0, 2, 4)$; $\mathbf{v}_1 = (1, 1, 3, 0)$, $\mathbf{v}_2 = (-2, -1, -2, 1)$, $\mathbf{v}_3 = (-3, -1, 1, 3)$ 

6. Find the orthogonal projection of $\mathbf{u} = (5, 6, 7, 2)$ onto the solution space of the homogeneous linear system 

\[ \begin{aligned} x_1 + x_2 + x_3 &= 0 \\ 2x_2 + x_3 + x_4 &= 0 \end{aligned} \]

7. Use Formula 6 and the method of Example 3 to find the standard matrix for the orthogonal projection $P: R^2 \to R^2$ onto 

(a) the $x$-axis 

(b) the $y$-axis 

Note Compare your results to Table 4 of Section 4.2. 

8. Use Formula 6 and the method of Example 3 to find the standard matrix for the orthogonal projection $P: R^3 \to R^3$ onto 

(a) the $xz$-plane 

(b) the $yz$-plane 

Note Compare your results to Table 5 of Section 4.2. 



9. Show that if $\mathbf{w} = (a, b, c)$ is a nonzero vector, then the standard matrix for the orthogonal projection of $R^3$ onto the line 
span$\{\mathbf{w}\}$ is 

\[ P = \frac{1}{a^2 + b^2 + c^2} \begin{bmatrix} a^2 & ab & ac \\ ab & b^2 & bc \\ ac & bc & c^2 \end{bmatrix} \]



10. Let $W$ be the plane with equation $5x - 3y + z = 0$. 

(a) Find a basis for $W$. 

(b) Use Formula 6 to find the standard matrix for the orthogonal projection onto $W$. 

(c) Use the matrix obtained in (b) to find the orthogonal projection of a point $P_0(x_0, y_0, z_0)$ onto $W$. 

(d) Find the distance between the point $P_0(1, -2, 4)$ and the plane $W$, and check your result using Theorem 3.5.2. 

11. Let $W$ be the line with parametric equations 

\[ x = 2t, \quad y = -t, \quad z = 4t \quad (-\infty < t < \infty) \]

(a) Find a basis for $W$. 

(b) Use Formula 6 to find the standard matrix for the orthogonal projection onto $W$. 

(c) Use the matrix obtained in (b) to find the orthogonal projection of a point $P_0(x_0, y_0, z_0)$ onto $W$. 

(d) Find the distance between the point $P_0(2, 1, -3)$ and the line $W$. 

12. In $R^3$, consider the line $l$ given by the equations $(x = t, y = t, z = t)$ and the line $m$ given by the equations 
$(x = s, y = 2s - 1, z = 1)$. Let $P$ be a point on $l$, and let $Q$ be a point on $m$. Find the values of $t$ and $s$ that minimize the 
distance between the lines by minimizing the squared distance $\|P - Q\|^2$. 

13. For the linear systems in Exercise 3, verify that the error vector $A\mathbf{x} - \mathbf{b}$ resulting from the least squares solution $\mathbf{x}$ is 
orthogonal to the column space of $A$. 

14. Prove: If $A$ has linearly independent column vectors, and if $A\mathbf{x} = \mathbf{b}$ is consistent, then the least squares solution of $A\mathbf{x} = \mathbf{b}$ 
and the exact solution of $A\mathbf{x} = \mathbf{b}$ are the same. 

15. Prove: If $A$ has linearly independent column vectors, and if $\mathbf{b}$ is orthogonal to the column space of $A$, then the least 
squares solution of $A\mathbf{x} = \mathbf{b}$ is $\mathbf{x} = \mathbf{0}$. 



16. Let $P: R^m \to W$ be the orthogonal projection of $R^m$ onto a subspace $W$. 

(a) Prove that $[P]^2 = [P]$. 

(b) What does the result in part (a) imply about the composition $P \circ P$? 

(c) Show that $[P]$ is symmetric. 

(d) Verify that the matrices in Tables 4 and 5 of Section 4.2 have the properties in parts (a) and (c). 

17. Let $A$ be an $m \times n$ matrix with linearly independent row vectors. Find a standard matrix for the orthogonal projection of 
$R^n$ onto the row space of $A$. 

Hint Start with Formula 6. 

18. The relationship between the current $I$ through a resistor and the voltage drop $V$ across it is given by Ohm's Law $V = IR$. 
Successive experiments are performed in which a known current (measured in amps) is passed through a resistor of 
unknown resistance $R$ and the voltage drop (measured in volts) is measured. This results in the $(I, V)$ data (0.1, 1), (0.2, 
2.1), (0.3, 2.9), (0.4, 4.2), (0.5, 5.1). The data is assumed to have measurement errors that prevent it from following 
Ohm's Law precisely. 



(a) Set up a $5 \times 1$ linear system that represents the 5 equations $I_0 R = V_0, \ldots, I_4 R = V_4$. 



(b) Is this system consistent? 



(c) Find the least squares solution of this system and interpret your result. 



19. Repeat Exercise 18 under the assumption that the relationship between the current $I$ and the voltage drop $V$ is best 
modeled by an equation of the form $V = IR + c$, where $c$ is a constant offset value. This leads to a $5 \times 2$ linear system. 

20. Use the techniques of Section 4.4 to fit a polynomial of degree 4 to the data of Exercise 18. Is there a physical 
interpretation of your result? 



Discussion 
Discovery 



21. The following is the proof that (b) ⇒ (a) in Theorem 6.4.3. Justify each line by filling in the 
blank appropriately. 

Hypothesis: Suppose that $A$ is an $m \times n$ matrix and $A^T A$ is invertible. 

Conclusion: $A$ has linearly independent column vectors. 

Proof: 

1. If $\mathbf{x}$ is a solution of $A\mathbf{x} = \mathbf{0}$, then $A^T A\mathbf{x} = \mathbf{0}$. ________ 

2. Thus, $\mathbf{x} = \mathbf{0}$. ________ 

3. Thus, the column vectors of $A$ are linearly independent. ________ 

22. Let $A$ be an $m \times n$ matrix with linearly independent column vectors, and let $\mathbf{b}$ be an $m \times 1$ 
matrix. Give a formula in terms of $A$ and $A^T$ for 

(a) the vector in the column space of $A$ that is closest to $\mathbf{b}$ relative to the Euclidean inner 
product; 

(b) the least squares solution of $A\mathbf{x} = \mathbf{b}$ relative to the Euclidean inner product; 

(c) the error in the least squares solution of $A\mathbf{x} = \mathbf{b}$ relative to the Euclidean inner product; 

(d) the standard matrix for the orthogonal projection of $R^m$ onto the column space of $A$ 
relative to the Euclidean inner product. 

23. Refer to Exercises 18-20. Contrast the techniques of polynomial interpolation and fitting a line 
by least squares. Give circumstances under which each is useful and appropriate. 



Copyright © 2005 John Wiley & Sons, Inc. All rights reserved. 



6.5 

CHANGE OF BASIS 



A basis that is suitable for one problem may not be suitable for another, so it is 
a common process in the study of vector spaces to change from one basis to 
another. Because a basis is the vector space generalization of a coordinate 
system, changing bases is akin to changing coordinate axes in $R^2$ and $R^3$. In 
this section we shall study problems related to change of basis. 



Coordinate Vectors 

Recall from Theorem 5.4.1 that if $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis for a vector space V, then each vector $\mathbf{v}$ in V can be expressed
uniquely as a linear combination of the basis vectors, say

\[ \mathbf{v} = k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_n\mathbf{v}_n \]

The scalars $k_1, k_2, \ldots, k_n$ are the coordinates of $\mathbf{v}$ relative to S, and the vector

\[ (\mathbf{v})_S = (k_1, k_2, \ldots, k_n) \]

is the coordinate vector of $\mathbf{v}$ relative to S. In this section it will be convenient to list the coordinates as entries of an n x 1 matrix.
Thus we take

\[ [\mathbf{v}]_S = \begin{bmatrix} k_1 \\ k_2 \\ \vdots \\ k_n \end{bmatrix} \]

to be the coordinate vector of $\mathbf{v}$ relative to S.

Change of Basis 

In applications it is common to work with more than one coordinate system, and in such cases it is usually necessary to know the 
relationships between the coordinates of a fixed point or vector in the various coordinate systems. Since a basis is the vector 
space generalization of a coordinate system, we are led to consider the following problem. 

Change-of-Basis Problem If we change the basis for a vector space V from some old basis B to some new basis $B'$, how is the
old coordinate vector $[\mathbf{v}]_B$ of a vector $\mathbf{v}$ related to the new coordinate vector $[\mathbf{v}]_{B'}$?



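For simplicity, we will solve this problem for two-dimensional spaces. The solution for n-dimensional spaces is similar and is left
for the reader. Let

\[ B = \{\mathbf{u}_1, \mathbf{u}_2\} \quad\text{and}\quad B' = \{\mathbf{u}'_1, \mathbf{u}'_2\} \]

be the old and new bases, respectively. We will need the coordinate vectors for the new basis vectors relative to the old basis.
Suppose they are

\[ [\mathbf{u}'_1]_B = \begin{bmatrix} a \\ b \end{bmatrix} \quad\text{and}\quad [\mathbf{u}'_2]_B = \begin{bmatrix} c \\ d \end{bmatrix} \tag{1} \]

That is,

\[ \begin{aligned} \mathbf{u}'_1 &= a\mathbf{u}_1 + b\mathbf{u}_2 \\ \mathbf{u}'_2 &= c\mathbf{u}_1 + d\mathbf{u}_2 \end{aligned} \tag{2} \]

Now let $\mathbf{v}$ be any vector in V, and let

\[ [\mathbf{v}]_{B'} = \begin{bmatrix} k_1 \\ k_2 \end{bmatrix} \tag{3} \]

be the new coordinate vector, so that

\[ \mathbf{v} = k_1\mathbf{u}'_1 + k_2\mathbf{u}'_2 \tag{4} \]

In order to find the old coordinates of $\mathbf{v}$, we must express $\mathbf{v}$ in terms of the old basis B. To do this, we substitute (2) into (4). This
yields

\[ \mathbf{v} = k_1(a\mathbf{u}_1 + b\mathbf{u}_2) + k_2(c\mathbf{u}_1 + d\mathbf{u}_2) \]

or

\[ \mathbf{v} = (k_1 a + k_2 c)\mathbf{u}_1 + (k_1 b + k_2 d)\mathbf{u}_2 \]

Thus the old coordinate vector for $\mathbf{v}$ is

\[ [\mathbf{v}]_B = \begin{bmatrix} k_1 a + k_2 c \\ k_1 b + k_2 d \end{bmatrix} \]

which can be written as

\[ [\mathbf{v}]_B = \begin{bmatrix} a & c \\ b & d \end{bmatrix} \begin{bmatrix} k_1 \\ k_2 \end{bmatrix} \]

or, from (3),

\[ [\mathbf{v}]_B = \begin{bmatrix} a & c \\ b & d \end{bmatrix} [\mathbf{v}]_{B'} \]

This equation states that the old coordinate vector $[\mathbf{v}]_B$ results when we multiply the new coordinate vector $[\mathbf{v}]_{B'}$ on the left by
the matrix

\[ P = \begin{bmatrix} a & c \\ b & d \end{bmatrix} \]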


The columns of this matrix are the coordinates of the new basis vectors relative to the old basis [see (1)]. Thus we have the 
following solution of the change-of-basis problem. 

Solution of the Change-of-Basis Problem If we change the basis for a vector space V from the old basis $B = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$
to the new basis $B' = \{\mathbf{u}'_1, \mathbf{u}'_2, \ldots, \mathbf{u}'_n\}$, then the old coordinate vector $[\mathbf{v}]_B$ of a vector $\mathbf{v}$ is related to the new coordinate vector
$[\mathbf{v}]_{B'}$ of the same vector $\mathbf{v}$ by the equation

\[ [\mathbf{v}]_B = P[\mathbf{v}]_{B'} \tag{5} \]

where the columns of P are the coordinate vectors of the new basis vectors relative to the old basis; that is, the column vectors of
P are

\[ [\mathbf{u}'_1]_B, \; [\mathbf{u}'_2]_B, \; \ldots, \; [\mathbf{u}'_n]_B \]



Transition Matrices 

The matrix P is called the transition matrix from $B'$ to B; it can be expressed in terms of its column vectors as

\[ P = \big[\, [\mathbf{u}'_1]_B \mid [\mathbf{u}'_2]_B \mid \cdots \mid [\mathbf{u}'_n]_B \,\big] \tag{6} \]



EXAMPLE 1 Finding a Transition Matrix 



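Consider the bases $B = \{\mathbf{u}_1, \mathbf{u}_2\}$ and $B' = \{\mathbf{u}'_1, \mathbf{u}'_2\}$ for $R^2$, where

\[ \mathbf{u}_1 = (1, 0); \quad \mathbf{u}_2 = (0, 1); \quad \mathbf{u}'_1 = (1, 1); \quad \mathbf{u}'_2 = (2, 1) \]

(a) Find the transition matrix from $B'$ to B.

(b) Use (5) to find $[\mathbf{v}]_B$ if

\[ [\mathbf{v}]_{B'} = \begin{bmatrix} -3 \\ 5 \end{bmatrix} \]

Solution (a)

First we must find the coordinate vectors for the new basis vectors $\mathbf{u}'_1$ and $\mathbf{u}'_2$ relative to the old basis B. By inspection,

\[ \begin{aligned} \mathbf{u}'_1 &= \mathbf{u}_1 + \mathbf{u}_2 \\ \mathbf{u}'_2 &= 2\mathbf{u}_1 + \mathbf{u}_2 \end{aligned} \]

so

\[ [\mathbf{u}'_1]_B = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad\text{and}\quad [\mathbf{u}'_2]_B = \begin{bmatrix} 2 \\ 1 \end{bmatrix} \]

Thus the transition matrix from $B'$ to B is

\[ P = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix} \]

Solution (b)

Using (5) and the transition matrix in part (a) yields

\[ [\mathbf{v}]_B = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -3 \\ 5 \end{bmatrix} = \begin{bmatrix} 7 \\ 2 \end{bmatrix} \]

As a check, we should be able to recover the vector $\mathbf{v}$ either from $[\mathbf{v}]_B$ or $[\mathbf{v}]_{B'}$. We leave it for the reader to show that

\[ -3\mathbf{u}'_1 + 5\mathbf{u}'_2 = 7\mathbf{u}_1 + 2\mathbf{u}_2 = \mathbf{v} = (7, 2). \]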
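For readers working with a technology utility, the computation in Example 1 can be sketched in a few lines of Python. This is only an illustrative sketch under our own assumptions: the helper names `columns_to_matrix` and `mat_vec` are hypothetical, and the shortcut used to build P works here because the old basis B is the standard basis, so each new basis vector is its own coordinate vector relative to B.

```python
def columns_to_matrix(cols):
    """Return the matrix (as a list of rows) whose columns are the given vectors."""
    return [list(row) for row in zip(*cols)]

def mat_vec(M, x):
    """Multiply the matrix M (list of rows) by the column vector x (list)."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

# New basis vectors u'_1 and u'_2, written in coordinates relative to B.
u1_new = [1, 1]
u2_new = [2, 1]

# Transition matrix from B' to B: its columns are [u'_1]_B and [u'_2]_B.
P = columns_to_matrix([u1_new, u2_new])

v_new = [-3, 5]            # [v]_{B'}
v_old = mat_vec(P, v_new)  # [v]_B = P [v]_{B'}

print(P)      # [[1, 2], [1, 1]]
print(v_old)  # [7, 2]
```

When the old basis is not the standard basis, each column of P must instead be found by solving a linear system, as in Example 2.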



EXAMPLE 2 A Different Viewpoint on Example 1 



Consider the vectors $\mathbf{u}_1 = (1, 0)$, $\mathbf{u}_2 = (0, 1)$, $\mathbf{u}'_1 = (1, 1)$, $\mathbf{u}'_2 = (2, 1)$. In Example 1 we found the transition matrix from the
basis $B' = \{\mathbf{u}'_1, \mathbf{u}'_2\}$ for $R^2$ to the basis $B = \{\mathbf{u}_1, \mathbf{u}_2\}$. However, we can just as well ask for the transition matrix from B to $B'$.
To obtain this matrix, we simply change our point of view and regard $B'$ as the old basis and B as the new basis. As usual, the
columns of the transition matrix will be the coordinates of the new basis vectors relative to the old basis.

By equating corresponding components and solving the resulting linear system, the reader should be able to show that

\[ \begin{aligned} \mathbf{u}_1 &= -\mathbf{u}'_1 + \mathbf{u}'_2 \\ \mathbf{u}_2 &= 2\mathbf{u}'_1 - \mathbf{u}'_2 \end{aligned} \]

so

\[ [\mathbf{u}_1]_{B'} = \begin{bmatrix} -1 \\ 1 \end{bmatrix} \quad\text{and}\quad [\mathbf{u}_2]_{B'} = \begin{bmatrix} 2 \\ -1 \end{bmatrix} \]

Thus the transition matrix from B to $B'$ is

\[ Q = \begin{bmatrix} -1 & 2 \\ 1 & -1 \end{bmatrix} \]

If we multiply the transition matrix from $B'$ to B obtained in Example 1 and the transition matrix from B to $B'$ obtained in
Example 2, we find

\[ PQ = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -1 & 2 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I \]

which shows that $Q = P^{-1}$. The following theorem shows that this is not accidental.



THEOREM 6.5.1 



If P is the transition matrix from a basis $B'$ to a basis B for a finite-dimensional vector space V, then P is invertible, and $P^{-1}$
is the transition matrix from B to $B'$.



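Proof Let Q be the transition matrix from B to $B'$. We shall show that PQ = I and thus conclude that $Q = P^{-1}$ to complete the
proof.

Assume that $B = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ and suppose that

\[ PQ = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{bmatrix} \]

From (5),

\[ [\mathbf{x}]_B = P[\mathbf{x}]_{B'} \quad\text{and}\quad [\mathbf{x}]_{B'} = Q[\mathbf{x}]_B \]

for all $\mathbf{x}$ in V. Multiplying the second equation through on the left by P and substituting the first gives

\[ [\mathbf{x}]_B = PQ[\mathbf{x}]_B \tag{7} \]

for all $\mathbf{x}$ in V. Letting $\mathbf{x} = \mathbf{u}_1$ in (7) gives

\[ \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \quad\text{or}\quad \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} c_{11} \\ c_{21} \\ \vdots \\ c_{n1} \end{bmatrix} \]

Similarly, successively substituting $\mathbf{x} = \mathbf{u}_2, \ldots, \mathbf{u}_n$ in (7) yields

\[ \begin{bmatrix} c_{12} \\ c_{22} \\ \vdots \\ c_{n2} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \; \ldots, \; \begin{bmatrix} c_{1n} \\ c_{2n} \\ \vdots \\ c_{nn} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \]

Therefore, PQ = I.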



To summarize, if P is the transition matrix from a basis $B'$ to a basis B, then for every vector $\mathbf{v}$, the following relationships hold:

\[ [\mathbf{v}]_B = P[\mathbf{v}]_{B'} \tag{8} \]

\[ [\mathbf{v}]_{B'} = P^{-1}[\mathbf{v}]_B \tag{9} \]
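Relationships (8) and (9) can be checked numerically for the bases of Examples 1 and 2. The following is a minimal sketch, assuming exact rational arithmetic from Python's standard `fractions` module; the helper names `inverse_2x2`, `mat_mul`, and `mat_vec` are our own and are not part of the text.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Invert a 2 x 2 matrix using the adjoint formula, in exact arithmetic."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_vec(M, x):
    """Product of a matrix and a column vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

P = [[1, 2], [1, 1]]       # transition matrix from B' to B (Example 1)
P_inv = inverse_2x2(P)     # transition matrix from B to B' (Theorem 6.5.1)

v_old = [7, 2]                 # [v]_B
v_new = mat_vec(P_inv, v_old)  # [v]_{B'} = P^{-1} [v]_B, formula (9)

print(P_inv == [[-1, 2], [1, -1]])            # True: agrees with Q in Example 2
print(mat_mul(P, P_inv) == [[1, 0], [0, 1]])  # True: PQ = I
print(v_new == [-3, 5])                       # True
print(mat_vec(P, v_new) == v_old)             # True: formula (8) recovers [v]_B
```

Exact fractions are used so that the comparison with the identity matrix is not affected by rounding.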



Exercise Set 6.5 






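1. Find the coordinate vector for $\mathbf{w}$ relative to the basis $S = \{\mathbf{u}_1, \mathbf{u}_2\}$ for $R^2$.

(a) $\mathbf{u}_1 = (1, 0)$, $\mathbf{u}_2 = (0, 1)$; $\mathbf{w} = (3, -7)$

(b) $\mathbf{u}_1 = (2, -4)$, $\mathbf{u}_2 = (3, 8)$; $\mathbf{w} = (1, 1)$

(c) $\mathbf{u}_1 = (1, 1)$, $\mathbf{u}_2 = (0, 2)$; $\mathbf{w} = (a, b)$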



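2. Find the coordinate vector for $\mathbf{v}$ relative to $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$.

(a) $\mathbf{v} = (2, -1, 3)$; $\mathbf{v}_1 = (1, 0, 0)$, $\mathbf{v}_2 = (2, 2, 0)$, $\mathbf{v}_3 = (3, 3, 3)$

(b) $\mathbf{v} = (5, -12, 3)$; $\mathbf{v}_1 = (1, 2, 3)$, $\mathbf{v}_2 = (-4, 5, 6)$, $\mathbf{v}_3 = (7, -8, 9)$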



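3. Find the coordinate vector for $\mathbf{p}$ relative to $S = \{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$.

(a) $\mathbf{p} = 4 - 3x + x^2$; $\mathbf{p}_1 = 1$, $\mathbf{p}_2 = x$, $\mathbf{p}_3 = x^2$

(b) $\mathbf{p} = 2 - x + x^2$; $\mathbf{p}_1 = 1 + x$, $\mathbf{p}_2 = 1 + x^2$, $\mathbf{p}_3 = x + x^2$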



4. Find the coordinate vector for A relative to $S = \{A_1, A_2, A_3, A_4\}$.



A = 



2 
1 3 



A,= 



-1 1 




A 2 = 



1 1 




A 3 = 





1 



A 4 = 




1 



Consider the coordinate vectors 



[*]$ = 



■1 
4 



[*\s= 



3 

4 



[B] S = 



■8 
7 
6 
3 



(a) Find $\mathbf{w}$ if S is the basis in Exercise 2(a).

(b) Find $\mathbf{q}$ if S is the basis in Exercise 3(a).

(c) Find B if S is the basis in Exercise 4.



6. Consider the bases $B = \{\mathbf{u}_1, \mathbf{u}_2\}$ and $B' = \{\mathbf{v}_1, \mathbf{v}_2\}$ for $R^2$, where



111 = 



V 



. u 2 = 


"0" 
_1_ 


. v l = 


~2~ 
_1_ 



= and v 2 



(a) Find the transition matrix from $B'$ to B.

(b) Find the transition matrix from B to $B'$.

(c) Compute the coordinate vector $[\mathbf{w}]_B$, where

\[ \mathbf{w} = \begin{bmatrix} 3 \\ -5 \end{bmatrix} \]

and use (9) to compute $[\mathbf{w}]_{B'}$.

(d) Check your work by computing $[\mathbf{w}]_{B'}$ directly.



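7. Repeat the directions of Exercise 6 with the same vector $\mathbf{w}$ but with

\[ \mathbf{u}_1 = \begin{bmatrix} 2 \\ 2 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} 4 \\ -1 \end{bmatrix}, \quad \mathbf{v}_1 = \begin{bmatrix} 1 \\ 3 \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} -1 \\ -1 \end{bmatrix} \]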



8. Consider the bases $B = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ and $B' = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ for $R^3$, where



ui = 



' -2 




"-3" 




r 




"-6" 




"-2" 




"-2" 





. u 2 = 


2 


. u 3 = 


6 


. y i = 


-6 


,V2 = 


-6 


. v 3 = 


-3 


-3 




-1 




-1 









4 




7 



(a) Find the transition matrix from B to $B'$.

(b) Compute the coordinate vector $[\mathbf{w}]_B$, where



w = 



-5 

3 

-5 



and use (9) to compute $[\mathbf{w}]_{B'}$.

(c) Check your work by computing $[\mathbf{w}]_{B'}$ directly.



9. Repeat the directions of Exercise 8 with the same vector $\mathbf{w}$, but with



ui = 



~2 




2 




"1" 




3~ 




r 




"-1" 


1 
1 


> U2 = 


-1 
1 


> 13 = 


2 
1 


. ▼! = 


1 
-5 


.▼2 = 


i 

-3 


. v 3 = 



2 



10. Consider the bases $B = \{\mathbf{p}_1, \mathbf{p}_2\}$ and $B' = \{\mathbf{q}_1, \mathbf{q}_2\}$ for $P_1$, where

\[ \mathbf{p}_1 = 6 + 3x, \quad \mathbf{p}_2 = 10 + 2x, \quad \mathbf{q}_1 = 2, \quad \mathbf{q}_2 = 3 + 2x \]

(a) Find the transition matrix from $B'$ to B.

(b) Find the transition matrix from B to $B'$.

(c) Compute the coordinate vector $[\mathbf{p}]_B$, where $\mathbf{p} = -4 + x$, and use (9) to compute $[\mathbf{p}]_{B'}$.

(d) Check your work by computing $[\mathbf{p}]_{B'}$ directly.



11. Let V be the space spanned by $\mathbf{f}_1 = \sin x$ and $\mathbf{f}_2 = \cos x$.

(a) Show that $\mathbf{g}_1 = 2\sin x + \cos x$ and $\mathbf{g}_2 = 3\cos x$ form a basis for V.

(b) Find the transition matrix from $B' = \{\mathbf{g}_1, \mathbf{g}_2\}$ to $B = \{\mathbf{f}_1, \mathbf{f}_2\}$.

(c) Find the transition matrix from B to $B'$.

(d) Compute the coordinate vector $[\mathbf{h}]_B$, where $\mathbf{h} = 2\sin x - 5\cos x$, and use (9) to obtain $[\mathbf{h}]_{B'}$.

(e) Check your work by computing $[\mathbf{h}]_{B'}$ directly.



12. If P is the transition matrix from a basis $B'$ to a basis B, and Q is the transition matrix from B to a basis C, what is the
transition matrix from $B'$ to C? What is the transition matrix from C to $B'$?

13. Refer to Section 4.4.

(a) Identify the bases for $P_2$ used for interpolation in the standard form (found by using the Vandermonde system), the
Newton form, and the Lagrange form, assuming $x_0 = -1$, $x_1 = 0$, and $x_2 = 1$.

(b) What is the transition matrix from the Newton form basis to the standard basis?

14. To write the coordinate vector for a vector, it is necessary to specify an order for the vectors in the basis. If P is the transition
matrix from a basis $B'$ to a basis B, what is the effect on P if we reverse the order of vectors in B from $\mathbf{v}_1, \ldots, \mathbf{v}_n$ to $\mathbf{v}_n, \ldots, \mathbf{v}_1$?
What is the effect on P if we reverse the order of vectors in both $B'$ and B?



Discussion 
Discovery 



Consider the matrix 



15. 




(a) P is the transition matrix from what basis B to the standard basis $S = \{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ for $R^3$?

(b) P is the transition matrix from the standard basis $S = \{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ to what basis B for $R^3$?



The matrix 



16. 




is the transition matrix from what basis B to the basis {(1, 1, 1), (1, 1, 0), (1, 0, 0)} for $R^3$?



17. If $[\mathbf{w}]_B = \mathbf{w}$ holds for all vectors $\mathbf{w}$ in $R^n$, what can you say about the basis B?

18. Indicate whether each statement is always true or sometimes false. Justify your answer by giving a
logical argument or a counterexample.

(a) Given two bases for the same inner product space, there is always a transition matrix from
one basis to the other basis.

(b) The transition matrix from B to B is always the identity matrix.

(c) Any invertible n x n matrix is the transition matrix for some pair of bases for $R^n$.
6.6

ORTHOGONAL MATRICES

In this section we shall develop properties of square matrices with orthonormal
column vectors. Such matrices arise in many contexts, including problems
involving a change from one orthonormal basis to another.



Matrices whose inverses can be obtained by transposition are sufficiently important that there is some terminology associated with 
them. 



DEFINITION

A square matrix A with the property

\[ A^{-1} = A^T \]

is said to be an orthogonal matrix.

It follows from this definition that a square matrix A is orthogonal if and only if

\[ AA^T = A^TA = I \tag{1} \]

In fact, it follows from Theorem 1.6.3 that a square matrix A is orthogonal if either $AA^T = I$ or $A^TA = I$.



EXAMPLE 1 A 3 x 3 Orthogonal Matrix 



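The matrix

\[ A = \begin{bmatrix} \frac{3}{7} & \frac{2}{7} & \frac{6}{7} \\ -\frac{6}{7} & \frac{3}{7} & \frac{2}{7} \\ \frac{2}{7} & \frac{6}{7} & -\frac{3}{7} \end{bmatrix} \]

is orthogonal, since

\[ A^TA = \begin{bmatrix} \frac{3}{7} & -\frac{6}{7} & \frac{2}{7} \\ \frac{2}{7} & \frac{3}{7} & \frac{6}{7} \\ \frac{6}{7} & \frac{2}{7} & -\frac{3}{7} \end{bmatrix} \begin{bmatrix} \frac{3}{7} & \frac{2}{7} & \frac{6}{7} \\ -\frac{6}{7} & \frac{3}{7} & \frac{2}{7} \\ \frac{2}{7} & \frac{6}{7} & -\frac{3}{7} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]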



EXAMPLE 2 A Rotation Matrix Is Orthogonal 



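Recall from Table 6 of Section 4.2 that the standard matrix for the counterclockwise rotation of $R^2$ through an angle $\theta$ is

\[ A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \]

This matrix is orthogonal for all choices of $\theta$, since

\[ A^TA = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]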



In fact, it is a simple matter to check that all of the "reflection matrices" in Tables 2 and 3 and all of the "rotation matrices" in 
Tables 6 and 7 of Section 4.2 are orthogonal matrices. 
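The check in Example 2 can also be carried out numerically. The sketch below is an illustration only: the function names `rotation` and `is_orthogonal` and the tolerance `tol` are our own assumptions, and a floating-point test of $A^TA = I$ can confirm orthogonality only up to rounding error.

```python
import math

def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*M)]

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def is_orthogonal(A, tol=1e-12):
    """Return True if A^T A equals the identity matrix to within tol."""
    n = len(A)
    G = mat_mul(transpose(A), A)
    return all(abs(G[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

def rotation(theta):
    """Standard matrix for the counterclockwise rotation of R^2 through theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# Every rotation matrix passes the test, whatever the angle.
angles = [0.0, math.pi / 6, math.pi / 4, 2.0, -1.3]
print(all(is_orthogonal(rotation(t)) for t in angles))   # True

# An invertible matrix need not be orthogonal.
print(is_orthogonal([[1.0, 2.0], [1.0, 1.0]]))            # False
```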

Observe that for the orthogonal matrices in Examples 1 and 2, both the row vectors and the column vectors form 
orthonormal sets with respect to the Euclidean inner product (verify). This is not accidental; it is a consequence of the following 
theorem. 



THEOREM 6.6.1 



The following are equivalent for an n x n matrix A.

(a) A is orthogonal.

(b) The row vectors of A form an orthonormal set in $R^n$ with the Euclidean inner product.

(c) The column vectors of A form an orthonormal set in $R^n$ with the Euclidean inner product.



Proof We shall prove the equivalence of (a) and (b) and leave the equivalence of (a) and (c) as an exercise. 

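(a) $\Leftrightarrow$ (b) The entry in the ith row and jth column of the matrix product $AA^T$ is the dot product of the ith row vector of A and the jth
column vector of $A^T$. But except for a difference in notation, the jth column vector of $A^T$ is the jth row vector of A. Thus, if the row
vectors of A are $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_n$, then the matrix product $AA^T$ can be expressed as

\[ AA^T = \begin{bmatrix} \mathbf{r}_1 \cdot \mathbf{r}_1 & \mathbf{r}_1 \cdot \mathbf{r}_2 & \cdots & \mathbf{r}_1 \cdot \mathbf{r}_n \\ \mathbf{r}_2 \cdot \mathbf{r}_1 & \mathbf{r}_2 \cdot \mathbf{r}_2 & \cdots & \mathbf{r}_2 \cdot \mathbf{r}_n \\ \vdots & \vdots & & \vdots \\ \mathbf{r}_n \cdot \mathbf{r}_1 & \mathbf{r}_n \cdot \mathbf{r}_2 & \cdots & \mathbf{r}_n \cdot \mathbf{r}_n \end{bmatrix} \]

Thus $AA^T = I$ if and only if

\[ \mathbf{r}_1 \cdot \mathbf{r}_1 = \mathbf{r}_2 \cdot \mathbf{r}_2 = \cdots = \mathbf{r}_n \cdot \mathbf{r}_n = 1 \]

and

\[ \mathbf{r}_i \cdot \mathbf{r}_j = 0 \quad\text{when } i \neq j \]

which are true if and only if $\{\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_n\}$ is an orthonormal set in $R^n$.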



Remark In light of Theorem 6.6.1, it would seem more appropriate to call orthogonal matrices orthonormal matrices. However, 
we will not do so in deference to historical tradition. 



The following theorem lists some additional fundamental properties of orthogonal matrices. The proofs are all straightforward and 
are left for the reader. 



THEOREM 6.6.2 



(a) The inverse of an orthogonal matrix is orthogonal.

(b) A product of orthogonal matrices is orthogonal.

(c) If A is orthogonal, then det(A) = 1 or det(A) = -1.



EXAMPLE 3 det(A) = ±1 for an Orthogonal Matrix A

The matrix

\[ A = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} \]

is orthogonal since its row (and column) vectors form orthonormal sets in $R^2$. We leave it for the reader to check that det(A) = 1.
Interchanging the rows produces an orthogonal matrix for which det(A) = -1.
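The two determinants mentioned in Example 3 are easy to confirm by machine. The following sketch is illustrative only; the helper `det2` is our own name, and because the entries $1/\sqrt{2}$ are stored as floating-point numbers the results are compared with ±1 using `math.isclose` rather than exact equality.

```python
import math

def det2(M):
    """Determinant of a 2 x 2 matrix given as a list of two rows."""
    (a, b), (c, d) = M
    return a * d - b * c

s = 1 / math.sqrt(2)

A = [[s, s],
     [-s, s]]              # the orthogonal matrix of Example 3
A_swapped = [A[1], A[0]]   # the same matrix with its two rows interchanged

print(math.isclose(det2(A), 1.0))            # True
print(math.isclose(det2(A_swapped), -1.0))   # True
```

Interchanging two rows changes the sign of a determinant but does not disturb the orthonormality of the rows, which is why both matrices are orthogonal.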



Orthogonal Matrices as Linear Operators 

We observed in Example 2 that the standard matrices for the basic reflection and rotation operators on $R^2$ and $R^3$ are orthogonal. 
The next theorem will help explain why this is so. 

THEOREM 6.6.3 



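If A is an n x n matrix, then the following are equivalent.

(a) A is orthogonal.

(b) $\|A\mathbf{x}\| = \|\mathbf{x}\|$ for all $\mathbf{x}$ in $R^n$.

(c) $A\mathbf{x} \cdot A\mathbf{y} = \mathbf{x} \cdot \mathbf{y}$ for all $\mathbf{x}$ and $\mathbf{y}$ in $R^n$.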



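Proof We shall prove the sequence of implications (a) $\Rightarrow$ (b) $\Rightarrow$ (c) $\Rightarrow$ (a).

(a) $\Rightarrow$ (b) Assume that A is orthogonal, so that $A^TA = I$. Then, from Formula 8 of Section 4.1,

\[ \|A\mathbf{x}\| = (A\mathbf{x} \cdot A\mathbf{x})^{1/2} = (\mathbf{x} \cdot A^TA\mathbf{x})^{1/2} = (\mathbf{x} \cdot \mathbf{x})^{1/2} = \|\mathbf{x}\| \]

(b) $\Rightarrow$ (c) Assume that $\|A\mathbf{x}\| = \|\mathbf{x}\|$ for all $\mathbf{x}$ in $R^n$. From Theorem 4.1.6 we have

\[ A\mathbf{x} \cdot A\mathbf{y} = \tfrac{1}{4}\|A\mathbf{x} + A\mathbf{y}\|^2 - \tfrac{1}{4}\|A\mathbf{x} - A\mathbf{y}\|^2 = \tfrac{1}{4}\|A(\mathbf{x} + \mathbf{y})\|^2 - \tfrac{1}{4}\|A(\mathbf{x} - \mathbf{y})\|^2 = \tfrac{1}{4}\|\mathbf{x} + \mathbf{y}\|^2 - \tfrac{1}{4}\|\mathbf{x} - \mathbf{y}\|^2 = \mathbf{x} \cdot \mathbf{y} \]

(c) $\Rightarrow$ (a) Assume that $A\mathbf{x} \cdot A\mathbf{y} = \mathbf{x} \cdot \mathbf{y}$ for all $\mathbf{x}$ and $\mathbf{y}$ in $R^n$. Then, from Formula 8 of Section 4.1, we have

\[ \mathbf{x} \cdot \mathbf{y} = \mathbf{x} \cdot A^TA\mathbf{y} \]

which can be rewritten as

\[ \mathbf{x} \cdot (A^TA\mathbf{y} - \mathbf{y}) = 0 \quad\text{or}\quad \mathbf{x} \cdot (A^TA - I)\mathbf{y} = 0 \]

Since this holds for all $\mathbf{x}$ in $R^n$, it holds in particular if

\[ \mathbf{x} = (A^TA - I)\mathbf{y}, \quad\text{so}\quad (A^TA - I)\mathbf{y} \cdot (A^TA - I)\mathbf{y} = 0 \]

from which we can conclude that

\[ (A^TA - I)\mathbf{y} = \mathbf{0} \tag{2} \]

(why?). Thus (2) is a homogeneous system of linear equations that is satisfied by every $\mathbf{y}$ in $R^n$. But this implies that the coefficient
matrix must be zero (why?), so $A^TA = I$ and, consequently, A is orthogonal.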

If $T: R^n \to R^n$ is multiplication by an orthogonal matrix A, then T is called an orthogonal operator on $R^n$. It follows from parts
(a) and (b) of the preceding theorem that the orthogonal operators on $R^n$ are precisely those operators that leave the lengths of all
vectors unchanged. Since reflections and rotations of $R^2$ and $R^3$ have this property, this explains our observation in Example 2 that
the standard matrices for the basic reflections and rotations of $R^2$ and $R^3$ are orthogonal.
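The length-preserving and dot-product-preserving properties in Theorem 6.6.3 can be observed numerically. The sketch below is a spot check on particular vectors, not a proof; the angle 0.7, the vectors x and y, and the helper names are arbitrary choices of ours, and the non-orthogonal matrix M is included only for contrast.

```python
import math

def mat_vec(M, x):
    """Product of a matrix (list of rows) and a column vector (list)."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def dot(x, y):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Euclidean norm of a vector."""
    return math.sqrt(dot(x, x))

theta = 0.7  # arbitrary illustrative angle in radians
A = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]   # orthogonal (a rotation matrix)

x = [3.0, -4.0]
y = [1.0, 2.0]
Ax, Ay = mat_vec(A, x), mat_vec(A, y)

length_preserved = math.isclose(norm(Ax), norm(x))   # part (b) of the theorem
dot_preserved = math.isclose(dot(Ax, Ay), dot(x, y)) # part (c) of the theorem

M = [[1.0, 2.0], [1.0, 1.0]]                          # invertible but not orthogonal
length_changed = not math.isclose(norm(mat_vec(M, x)), norm(x))

print(length_preserved, dot_preserved, length_changed)   # True True True
```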

Change of Orthonormal Basis 

The following theorem shows that in an inner product space, the transition matrix from one orthonormal basis to another is 
orthogonal. 

THEOREM 6.6.4 



If P is the transition matrix from one orthonormal basis to another orthonormal basis for an inner product space, then P is an
orthogonal matrix; that is, $P^{-1} = P^T$.



Proof Assume that V is an n-dimensional inner product space and that P is the transition matrix from an orthonormal basis $B'$ to an
orthonormal basis B. To prove that P is orthogonal, we shall use Theorem 6.6.3 and show that $\|P\mathbf{x}\| = \|\mathbf{x}\|$ for every vector $\mathbf{x}$ in $R^n$.

Recall from Theorem 6.3.2a that for any orthonormal basis for V, the norm of any vector $\mathbf{u}$ in V is the same as the norm of its
coordinate vector in $R^n$ with respect to the Euclidean inner product. Thus for any vector $\mathbf{u}$ in V, we have

\[ \|\mathbf{u}\| = \|[\mathbf{u}]_{B'}\| = \|[\mathbf{u}]_B\| \]

or

\[ \|\mathbf{u}\| = \|[\mathbf{u}]_{B'}\| = \|P[\mathbf{u}]_{B'}\| \tag{3} \]

where the first norm is with respect to the inner product on V and the second and third are with respect to the Euclidean inner
product on $R^n$.

Now let $\mathbf{x}$ be any vector in $R^n$, and let $\mathbf{u}$ be the vector in V whose coordinate vector with respect to the basis $B'$ is $\mathbf{x}$; that is,
$[\mathbf{u}]_{B'} = \mathbf{x}$. Thus, from (3),

\[ \|\mathbf{u}\| = \|\mathbf{x}\| = \|P\mathbf{x}\| \]

which proves that P is orthogonal.



EXAMPLE 4 Application to Rotation of Axes in 2-Space 



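In many problems a rectangular xy-coordinate system is given, and a new $x'y'$-coordinate system is obtained by rotating the xy-system
counterclockwise about the origin through an angle $\theta$. When this is done, each point Q in the plane has two sets of
coordinates: coordinates (x, y) relative to the xy-system and coordinates $(x', y')$ relative to the $x'y'$-system (Figure 6.6.1a).

By introducing unit vectors $\mathbf{u}_1$ and $\mathbf{u}_2$ along the positive x- and y-axes and unit vectors $\mathbf{u}'_1$ and $\mathbf{u}'_2$ along the positive $x'$- and $y'$-axes, we can
regard this rotation as a change from an old basis $B = \{\mathbf{u}_1, \mathbf{u}_2\}$ to a new basis $B' = \{\mathbf{u}'_1, \mathbf{u}'_2\}$ (Figure 6.6.1b). Thus, the new
coordinates $(x', y')$ and the old coordinates (x, y) of a point Q will be related by

\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = P^{-1} \begin{bmatrix} x \\ y \end{bmatrix} \tag{4} \]

where P is the transition matrix from $B'$ to B. To find P we must determine the coordinate matrices of the new basis vectors $\mathbf{u}'_1$ and $\mathbf{u}'_2$
relative to the old basis. As indicated in Figure 6.6.1c, the components of $\mathbf{u}'_1$ in the old basis are $\cos\theta$ and $\sin\theta$, so

\[ [\mathbf{u}'_1]_B = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix} \]

Similarly, from Figure 6.6.1d, we see that the components of $\mathbf{u}'_2$ in the old basis are $\cos(\theta + \pi/2) = -\sin\theta$ and
$\sin(\theta + \pi/2) = \cos\theta$, so

\[ [\mathbf{u}'_2]_B = \begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix} \]

Figure 6.6.1

Thus the transition matrix from $B'$ to B is

\[ P = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \]

Observe that P is an orthogonal matrix, as expected, since B and $B'$ are orthonormal bases. Thus

\[ P^{-1} = P^T = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \]

so (4) yields

\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \tag{5} \]

or, equivalently,

\[ \begin{aligned} x' &= x\cos\theta + y\sin\theta \\ y' &= -x\sin\theta + y\cos\theta \end{aligned} \]

For example, if the axes are rotated $\theta = \pi/4$, then since

\[ \sin\frac{\pi}{4} = \cos\frac{\pi}{4} = \frac{1}{\sqrt{2}} \]

Equation (5) becomes

\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]

Thus, if the old coordinates of a point Q are (x, y) = (2, -1), then

\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} 2 \\ -1 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{3}{\sqrt{2}} \end{bmatrix} \]

so the new coordinates of Q are $(x', y') = (1/\sqrt{2}, -3/\sqrt{2})$.

Remark Observe that the coefficient matrix in (5) is the same as the standard matrix for the linear operator that rotates the vectors of
$R^2$ through the angle $-\theta$ (Table 6 of Section 4.2). This is to be expected since rotating the coordinate axes through the angle $\theta$ with
the vectors of $R^2$ kept fixed has the same effect as rotating the vectors through the angle $-\theta$ with the axes kept fixed.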
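The rotation-of-axes equations of Example 4 translate directly into a short routine. The sketch below is an illustration under our own naming: the functions `rotate_axes_2d` and `unrotate_axes_2d` are hypothetical helpers, the first applying $P^{-1} = P^T$ to pass from old to new coordinates and the second applying P to go back.

```python
import math

def rotate_axes_2d(x, y, theta):
    """New coordinates (x', y') of the point (x, y) after the axes are
    rotated counterclockwise about the origin through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (x * c + y * s, -x * s + y * c)

def unrotate_axes_2d(xp, yp, theta):
    """Old coordinates (x, y) recovered from the new coordinates (x', y')."""
    c, s = math.cos(theta), math.sin(theta)
    return (xp * c - yp * s, xp * s + yp * c)

# The numerical case worked in Example 4: theta = pi/4 and Q = (2, -1).
xp, yp = rotate_axes_2d(2.0, -1.0, math.pi / 4)
print(math.isclose(xp, 1 / math.sqrt(2)))    # True
print(math.isclose(yp, -3 / math.sqrt(2)))   # True

# Applying P to the new coordinates returns the old coordinates.
x, y = unrotate_axes_2d(xp, yp, math.pi / 4)
print(math.isclose(x, 2.0), math.isclose(y, -1.0))   # True True
```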



EXAMPLE 5 Application to Rotation of Axes in 3-Space 



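Suppose that a rectangular xyz-coordinate system is rotated around its z-axis counterclockwise (looking down the positive z-axis)
through an angle $\theta$ (Figure 6.6.2). If we introduce unit vectors $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$ along the positive x-, y-, and z-axes and unit vectors $\mathbf{u}'_1$,
$\mathbf{u}'_2$, and $\mathbf{u}'_3$ along the positive $x'$-, $y'$-, and $z'$-axes, we can regard the rotation as a change from the old basis $B = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ to the
new basis $B' = \{\mathbf{u}'_1, \mathbf{u}'_2, \mathbf{u}'_3\}$. In light of Example 4, it should be evident that

\[ [\mathbf{u}'_1]_B = \begin{bmatrix} \cos\theta \\ \sin\theta \\ 0 \end{bmatrix} \quad\text{and}\quad [\mathbf{u}'_2]_B = \begin{bmatrix} -\sin\theta \\ \cos\theta \\ 0 \end{bmatrix} \]

Figure 6.6.2

Moreover, since $\mathbf{u}'_3$ extends 1 unit up the positive $z'$-axis,

\[ [\mathbf{u}'_3]_B = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]

Thus the transition matrix from $B'$ to B is

\[ P = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

and the transition matrix from B to $B'$ is

\[ P^{-1} = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

(verify). Thus the new coordinates $(x', y', z')$ of a point Q can be computed from its old coordinates (x, y, z) by

\[ \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]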
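The three-dimensional formula of Example 5 can be sketched in the same way. This is an illustrative sketch only: the function name `old_to_new_about_z`, the angle $\pi/2$, and the test point (1, 0, 2) are our own choices and do not come from the text.

```python
import math

def old_to_new_about_z(theta):
    """Transition matrix from B to B' (that is, P^{-1}) when the axes are
    rotated counterclockwise about the z-axis through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def mat_vec(M, x):
    """Product of a matrix (list of rows) and a column vector (list)."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

theta = math.pi / 2                 # a quarter turn of the axes
old = [1.0, 0.0, 2.0]               # old coordinates (x, y, z)
new = mat_vec(old_to_new_about_z(theta), old)

# After a quarter turn the new y'-axis points along the old negative x-axis,
# so the point (1, 0, 2) should have new coordinates (0, -1, 2).
print(math.isclose(new[0], 0.0, abs_tol=1e-12))   # True
print(math.isclose(new[1], -1.0))                 # True
print(new[2] == 2.0)                              # True: z is unchanged
```

The third coordinate is untouched because the rotation is about the z-axis, which is reflected in the last row and column of the matrix.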



Exercise Set 6.6 






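1. (a) Show that the matrix

\[ A = \begin{bmatrix} \frac{4}{5} & 0 & -\frac{3}{5} \\ -\frac{9}{25} & \frac{4}{5} & -\frac{12}{25} \\ \frac{12}{25} & \frac{3}{5} & \frac{16}{25} \end{bmatrix} \]

is orthogonal in three ways: by calculating $A^TA$, by using part (b) of Theorem 6.6.1, and by
using part (c) of Theorem 6.6.1.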



(b) Find the inverse of the matrix A in part (a). 



(a) Show that the matrix 



A = 



2 2 

3 3 

2 1 

3 3 

I 1 

3 3 



is orthogonal. 



(b) Let $T: R^3 \to R^3$ be multiplication by the matrix A in part (a). Find $T(\mathbf{x})$ for the vector $\mathbf{x} = (-2, 3, 5)$. Using the
Euclidean inner product on $R^3$, verify that $\|T(\mathbf{x})\| = \|\mathbf{x}\|$.



Determine which of the following matrices are orthogonal. For those that are orthogonal, find the inverse. 



(a) 



1 
1 



(b) 



l/j/2 -1//2 
l/j/2 1/^2 



(c) 



1 1//2 

1 

1/^2 



(d) 



-1//2 1/^6 1/^3 

-2/^6 1/^3 

1/^2 1/^6 1/^3 



(e) 



1 


1 


1 


1 


2 


2 


2 


2 


1 


5 


1 


1 


2 


6 


6 


6 


1 


1 


1 


5 


2 


6 


6 


6 


1 


1 


5 


1 


2 


6 


6 


6 



(f) 



10 

1/^3 -1/2 

1/^3 1 

1/^3 1/2 



(a) Show that if A is orthogonal, then $A^T$ is orthogonal.

(b) What is the normal system for $A\mathbf{x} = \mathbf{b}$ when A is orthogonal?



5. 



Verify that the reflection matrices in Tables 2 and 3 of Section 4.2 are orthogonal. 



6. Let a rectangular $x'y'$-coordinate system be obtained by rotating a rectangular xy-coordinate system counterclockwise through
the angle $\theta = 3\pi/4$.

(a) Find the $x'y'$-coordinates of the point whose xy-coordinates are (-2, 6).

(b) Find the xy-coordinates of the point whose $x'y'$-coordinates are (5, 2).

7. Repeat Exercise 6 with $\theta = \pi/3$.

8. Let a rectangular $x'y'z'$-coordinate system be obtained by rotating a rectangular xyz-coordinate system counterclockwise about
the z-axis (looking down the z-axis) through the angle $\theta = \pi/4$.

(a) Find the $x'y'z'$-coordinates of the point whose xyz-coordinates are (-1, 2, 5).

(b) Find the xyz-coordinates of the point whose $x'y'z'$-coordinates are (1, 6, -3).

9. Repeat Exercise 8 for a rotation of $\theta = \pi/3$ counterclockwise about the y-axis (looking along the positive y-axis toward the
origin).

10. Repeat Exercise 8 for a rotation of $\theta = 3\pi/4$ counterclockwise about the x-axis (looking along the positive x-axis toward the
origin).



11. (a) A rectangular $x'y'z'$-coordinate system is obtained by rotating an xyz-coordinate system counterclockwise about the
y-axis through an angle $\theta$ (looking along the positive y-axis toward the origin). Find a matrix A such that

\[ \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = A \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]

where (x, y, z) and $(x', y', z')$ are the coordinates of the same point in the xyz- and $x'y'z'$-systems,
respectively.

(b) Repeat part (a) for a rotation about the x-axis.



12. A rectangular $x''y''z''$-coordinate system is obtained by first rotating a rectangular xyz-coordinate system 60°
counterclockwise about the z-axis (looking down the positive z-axis) to obtain an $x'y'z'$-coordinate system, and then rotating
the $x'y'z'$-coordinate system 45° counterclockwise about the $y'$-axis (looking along the positive $y'$-axis toward the origin). Find
a matrix A such that

\[ \begin{bmatrix} x'' \\ y'' \\ z'' \end{bmatrix} = A \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]

where (x, y, z) and $(x'', y'', z'')$ are the xyz- and $x''y''z''$-coordinates of the same point.

13. What conditions must a and b satisfy for the matrix

\[ \begin{bmatrix} a + b & b - a \\ a - b & b + a \end{bmatrix} \]

to be orthogonal?



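14. Prove that a 2 x 2 orthogonal matrix A has one of two possible forms:

\[ A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \quad\text{or}\quad A = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix} \]

where $0 \le \theta < 2\pi$.

Hint Start with a general 2 x 2 matrix $A = (a_{ij})$, and use the fact that the column vectors form an orthonormal set in $R^2$.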



15. 



(a) Use the result in Exercise 14 to prove that multiplication by a 2 x 2 orthogonal matrix is either a rotation or a rotation 
followed by a reflection about the x-axis. 



(b) Show that multiplication by A is a rotation if det(A) = 1 and a rotation followed by a reflection if det(A) = -1.



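16. Use the result in Exercise 15 to determine whether multiplication by A is a rotation or a rotation followed by a reflection about
the x-axis. Find the angle of rotation in either case.

(a)

\[ A = \begin{bmatrix} -1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & -1/\sqrt{2} \end{bmatrix} \]

(b)

\[ A = \begin{bmatrix} -1/2 & \sqrt{3}/2 \\ \sqrt{3}/2 & 1/2 \end{bmatrix} \]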
^3/2 1/2 



17. The result in Exercise 15 has an analog for 3 x 3 orthogonal matrices: It can be proved that multiplication by a 3 x 3 orthogonal
matrix A is a rotation about some axis if det(A) = 1 and is a rotation about some axis followed by a reflection about some
coordinate plane if det(A) = -1. Determine whether multiplication by A is a rotation or a rotation followed by a reflection.



(a) 



A = 



(b) 



A = 



3 
1 


2 
7 


6 
7 


6 

7 


3 
7 


2 
7 


2 
7 


6 
7 


3 
7 


2 


3 


6 


7 


7 


7 


3 


6 


2 


7 


7 


7 


6 


2 


3 


7 


7 


7 



18. Use the fact stated in Exercise 17 and part (b) of Theorem 6.6.2 to show that a composition of rotations can always be
accomplished by a single rotation about some appropriate axis.



19. 



Prove the equivalence of statements (a) and (c) in Theorem 6.6.1. 



Discussion 

20. A linear operator on $R^2$ is called rigid if it does not change the lengths of vectors, and it is called
angle preserving if it does not change the angle between nonzero vectors.



(a) Name two different types of linear operators that are rigid. 

(b) Name two different types of linear operators that are angle preserving. 

(c) Are there any linear operators on R 2 that are rigid and not angle preserving? Angle preserving 
and not rigid? Justify your answer. 



21. Referring to Exercise 20, what can you say about det(A) if A is the standard matrix for a rigid linear
operator on $R^2$?



22. Find a, b, and c such that the matrix

\[ A = \begin{bmatrix} a & 1/\sqrt{2} & -1/\sqrt{2} \\ b & 1/\sqrt{6} & 1/\sqrt{6} \\ c & 1/\sqrt{3} & 1/\sqrt{3} \end{bmatrix} \]



is orthogonal. Are the values of a, b, and c unique? Explain. 
Chapter 6



Supplementary Exercises 



1. Let $R^4$ have the Euclidean inner product.

(a) Find a vector in $R^4$ that is orthogonal to $\mathbf{u}_1 = (1, 0, 0, 0)$ and $\mathbf{u}_4 = (0, 0, 0, 1)$ and makes equal angles with
$\mathbf{u}_2 = (0, 1, 0, 0)$ and $\mathbf{u}_3 = (0, 0, 1, 0)$.

(b) Find a vector $\mathbf{x} = (x_1, x_2, x_3, x_4)$ of length 1 that is orthogonal to $\mathbf{u}_1$ and $\mathbf{u}_4$ above and such that the cosine of the
angle between $\mathbf{x}$ and $\mathbf{u}_2$ is twice the cosine of the angle between $\mathbf{x}$ and $\mathbf{u}_3$.



2. Show that if $\mathbf{x}$ is a nonzero column vector in $R^n$, then the n x n matrix

\[ A = I_n - \frac{2}{\|\mathbf{x}\|^2}\,\mathbf{x}\mathbf{x}^T \]

is both orthogonal and symmetric.



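3. Let $A\mathbf{x} = \mathbf{0}$ be a system of m equations in n unknowns. Show that

\[ \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \]

is a solution of the system if and only if the vector $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ is orthogonal to every row vector of A in the
Euclidean inner product on $R^n$.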



4. Use the Cauchy-Schwarz inequality to show that if $a_1, a_2, \ldots, a_n$ are positive real numbers, then 



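5. Show that if $\mathbf{x}$ and $\mathbf{y}$ are vectors in an inner product space and c is any scalar, then

\[ \|c\mathbf{x} + \mathbf{y}\|^2 = c^2\|\mathbf{x}\|^2 + 2c\langle \mathbf{x}, \mathbf{y} \rangle + \|\mathbf{y}\|^2 \]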



6. Let $R^3$ have the Euclidean inner product. Find two vectors of length 1 that are orthogonal to all three of the vectors
$\mathbf{u}_1 = (1, 1, -1)$, $\mathbf{u}_2 = (-2, -1, 2)$, and $\mathbf{u}_3 = (-1, 0, 1)$.



7. Find a weighted Euclidean inner product on $R^n$ such that the vectors

\[ \begin{aligned} \mathbf{v}_1 &= (1, 0, 0, \ldots, 0) \\ \mathbf{v}_2 &= (0, \sqrt{2}, 0, \ldots, 0) \\ \mathbf{v}_3 &= (0, 0, \sqrt{3}, \ldots, 0) \\ &\;\;\vdots \\ \mathbf{v}_n &= (0, 0, 0, \ldots, \sqrt{n}) \end{aligned} \]

form an orthonormal set.

8. Is there a weighted Euclidean inner product on $R^2$ for which the vectors (1, 2) and (3, -1) form an orthonormal set?
Justify your answer.

9. Prove: If Q is an orthogonal matrix, then each entry of Q is the same as its cofactor if det(Q) = 1 and is the negative of its
cofactor if det(Q) = -1.

10. If $\mathbf{u}$ and $\mathbf{v}$ are vectors in an inner product space V, then $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{u} - \mathbf{v}$ can be regarded as sides of a "triangle" in V (see
the accompanying figure). Prove that the law of cosines holds for any such triangle; that is,
$\|\mathbf{u} - \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 - 2\|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta$, where $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v}$.




Figure Ex-10 



11. (a) In $R^3$ the vectors (k, 0, 0), (0, k, 0), and (0, 0, k) form the edges of a cube with diagonal (k, k, k) (Figure 3.3.4).
Similarly, in $R^n$ the vectors

\[ (k, 0, 0, \ldots, 0), \; (0, k, 0, \ldots, 0), \; \ldots, \; (0, 0, 0, \ldots, k) \]

can be regarded as edges of a "cube" with diagonal (k, k, k, ..., k). Show that each of the
above edges makes an angle of $\theta$ with the diagonal, where $\cos\theta = 1/\sqrt{n}$.

(b) (For Readers Who Have Studied Calculus). What happens to the angle $\theta$ in part (a) as the dimension of $R^n$
approaches $+\infty$?



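12. Let $\mathbf{u}$ and $\mathbf{v}$ be vectors in an inner product space.

(a) Prove that $\|\mathbf{u}\| = \|\mathbf{v}\|$ if and only if $\mathbf{u} + \mathbf{v}$ and $\mathbf{u} - \mathbf{v}$ are orthogonal.

(b) Give a geometric interpretation of this result in $R^2$ with the Euclidean inner product.

13. Let $\mathbf{u}$ be a vector in an inner product space V, and let $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be an orthonormal basis for V. Show that if $\alpha_i$ is
the angle between $\mathbf{u}$ and $\mathbf{v}_i$, then

\[ \cos^2\alpha_1 + \cos^2\alpha_2 + \cdots + \cos^2\alpha_n = 1 \]

14. Prove: If $\langle \mathbf{u}, \mathbf{v} \rangle_1$ and $\langle \mathbf{u}, \mathbf{v} \rangle_2$ are two inner products on a vector space V, then the quantity $\langle \mathbf{u}, \mathbf{v} \rangle = \langle \mathbf{u}, \mathbf{v} \rangle_1 + \langle \mathbf{u}, \mathbf{v} \rangle_2$ is
also an inner product.

15. Show that the inner product on $R^n$ generated by any orthogonal matrix is the Euclidean inner product.

16. Prove part (c) of Theorem 6.2.5.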


Copyright © 2005 John Wiley & Sons, Inc. All rights reserved. 



Chapter 6 



Technology Exercises



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 

Section 6.1 

T1. (Weighted Euclidean Inner Products) See if you can program your utility so that it produces the value of a weighted
Euclidean inner product when the user enters n, the weights, and the vectors. Check your work by having the program do
some specific computations.

T2. (Inner Product on $M_{22}$) See if you can program your utility to produce the inner product in Example 7 when the user enters
the matrices U and V. Check your work by having the program do some specific computations.

T3. (Inner Product on C[a, b]) If you are using a CAS or a technology utility that can do numerical integration, see if you can
program the utility to compute the inner product given in Example 9 when the user enters a, b, and the functions f(x) and
g(x). Check your work by having the program do some specific calculations.

Section 6.3 

T1. (Normalizing a Vector) See if you can create a program that will normalize a nonzero vector v in $R^n$ when the user enters v.



T2. (Gram-Schmidt Process) Read your documentation on performing the Gram-Schmidt process, and then use your utility to 
perform the computations in Example 7. 



T3. (QR-Decomposition) Read your documentation on performing the Gram-Schmidt process, and then use your utility to
perform the computations in Example 8. 

Section 6.4 

T1. (Least Squares) Read your documentation on finding least squares solutions of linear systems, and then use your utility to
find the least squares solution of the system in Example 1.

T2. (Orthogonal Projection onto a Subspace) Use the least squares capability of your technology utility to find the least
squares solution x of the normal system in Example 2, and then complete the computations in the example by computing Ax.
If you are successful, then see if you can create a program that will produce the orthogonal projection of a vector u in $R^4$ onto
a subspace W when the user enters u and a set of vectors that spans W.

Suggestion: As the first step, have the program create the matrix A that has the spanning vectors as columns.
Check your work by having your program find the orthogonal projection in Example 2.

Section 6.5 



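T1.

(a) Confirm that $B_1 = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \mathbf{u}_4, \mathbf{u}_5\}$ and $B_2 = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4, \mathbf{v}_5\}$ are bases for $R^5$, and find both transition
matrices.

$$
\begin{array}{ll}
\mathbf{u}_1 = (3, 1, 3, 2, 6) & \mathbf{v}_1 = (2, 6, 3, 4, 2) \\
\mathbf{u}_2 = (4, 5, 7, 2, 4) & \mathbf{v}_2 = (3, 1, 5, 8, 3) \\
\mathbf{u}_3 = (3, 2, 1, 5, 4) & \mathbf{v}_3 = (5, 1, 2, 6, 7) \\
\mathbf{u}_4 = (2, 9, 1, 4, 4) & \mathbf{v}_4 = (8, 4, 3, 2, 6) \\
\mathbf{u}_5 = (3, 3, 6, 6, 7) & \mathbf{v}_5 = (5, 5, 6, 3, 4)
\end{array}
$$

(b) Find the coordinate vectors with respect to $B_1$ and $B_2$ of $\mathbf{w} = (1, 1, 1, 1, 1)$.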
CHAPTER 7

Eigenvalues, Eigenvectors



INTRODUCTION: If $A$ is an $n \times n$ matrix and $\mathbf{x}$ is a vector in $R^n$, then $A\mathbf{x}$ is also a vector in $R^n$, but usually there is no
simple geometric relationship between $\mathbf{x}$ and $A\mathbf{x}$. However, in the special case where $\mathbf{x}$ is a nonzero vector and $A\mathbf{x}$ is a scalar
multiple of $\mathbf{x}$, a simple geometric relationship occurs. For example, if $A$ is a 2 x 2 matrix, and if $\mathbf{x}$ is a nonzero vector such that
$A\mathbf{x}$ is a scalar multiple of $\mathbf{x}$, say $A\mathbf{x} = \lambda\mathbf{x}$, then each vector on the line through the origin determined by $\mathbf{x}$ gets mapped back
onto the same line under multiplication by $A$.

Nonzero vectors that get mapped into scalar multiples of themselves under a linear operator arise naturally in the study of 
vibrations, genetics, population dynamics, quantum mechanics, and economics, as well as in geometry. In this chapter we will 
study such vectors and their applications. 
7.1

EIGENVALUES AND 
EIGENVECTORS 



In Section 2.3 we introduced the concepts of eigenvalue and eigenvector. In 
this section we will study those ideas in more detail to set the stage for 
applications of them in later sections. 



Review 

We begin with a review of some concepts that were mentioned in Sections 2.3 and 4.3. 



DEFINITION 



If $A$ is an $n \times n$ matrix, then a nonzero vector $\mathbf{x}$ in $R^n$ is called an eigenvector of $A$ if $A\mathbf{x}$ is a scalar multiple of $\mathbf{x}$; that is, if

$$A\mathbf{x} = \lambda\mathbf{x}$$

for some scalar $\lambda$. The scalar $\lambda$ is called an eigenvalue of $A$, and $\mathbf{x}$ is said to be an eigenvector of $A$ corresponding to $\lambda$.



In $R^2$ and $R^3$, multiplication by $A$ maps each eigenvector $\mathbf{x}$ of $A$ (if any) onto the same line through the origin as $\mathbf{x}$. Depending
on the sign and the magnitude of the eigenvalue $\lambda$ corresponding to $\mathbf{x}$, the linear operator $A\mathbf{x} = \lambda\mathbf{x}$ compresses or stretches $\mathbf{x}$ by a
factor of $\lambda$, with a reversal of direction in the case where $\lambda$ is negative (Figure 7.1.1).



Figure 7.1.1 (a) $0 \le \lambda \le 1$; (b) $\lambda \ge 1$; (c) $-1 \le \lambda \le 0$; (d) $\lambda \le -1$



EXAMPLE 1 Eigenvector of a 2 x 2 Matrix 



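The vector $\mathbf{x} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$ is an eigenvector of

$$A = \begin{bmatrix} 3 & 0 \\ 8 & -1 \end{bmatrix}$$

corresponding to the eigenvalue $\lambda = 3$, since

$$A\mathbf{x} = \begin{bmatrix} 3 & 0 \\ 8 & -1 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 3 \\ 6 \end{bmatrix} = 3\mathbf{x}$$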
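
The defining condition $A\mathbf{x} = \lambda\mathbf{x}$ can be checked mechanically. The following is a minimal sketch in Python, not part of the text; the helper names `mat_vec` and `is_eigenpair` are illustrative assumptions, and the matrix and vector are the ones from Example 1.

```python
def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector (list of entries)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def is_eigenpair(A, x, lam):
    """Return True if x is nonzero and A x equals lam * x (exact integer arithmetic)."""
    if all(xi == 0 for xi in x):
        return False  # the zero vector is never an eigenvector
    return mat_vec(A, x) == [lam * xi for xi in x]


A = [[3, 0], [8, -1]]   # matrix from Example 1
x = [1, 2]              # candidate eigenvector

print(mat_vec(A, x))          # the product A x
print(is_eigenpair(A, x, 3))  # checks A x = 3 x
```

The check uses exact integer comparison, which is adequate for hand-sized examples like this one; floating-point data would need a tolerance instead.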



To find the eigenvalues of an $n \times n$ matrix $A$, we rewrite $A\mathbf{x} = \lambda\mathbf{x}$ as

$$A\mathbf{x} = \lambda I\mathbf{x}$$

or, equivalently,

$$(\lambda I - A)\mathbf{x} = \mathbf{0} \tag{1}$$

For $\lambda$ to be an eigenvalue, there must be a nonzero solution of this equation. By Theorem 6.4.5, Equation (1) has a nonzero
solution if and only if

$$\det(\lambda I - A) = 0$$

This is called the characteristic equation of $A$; the scalars satisfying this equation are the eigenvalues of $A$. When expanded, the
determinant $\det(\lambda I - A)$ is always a polynomial $p$ in $\lambda$, called the characteristic polynomial of $A$.

It can be shown (Exercise 15) that if $A$ is an $n \times n$ matrix, then the characteristic polynomial of $A$ has degree $n$ and the coefficient
of $\lambda^n$ is 1; that is, the characteristic polynomial $p(\lambda)$ of an $n \times n$ matrix has the form

$$p(\lambda) = \det(\lambda I - A) = \lambda^n + c_1\lambda^{n-1} + \cdots + c_n$$

It follows from the Fundamental Theorem of Algebra that the characteristic equation

$$\lambda^n + c_1\lambda^{n-1} + \cdots + c_n = 0$$

has at most $n$ distinct solutions, so an $n \times n$ matrix has at most $n$ distinct eigenvalues.

The reader may wish to review Example 6 of Section 2.3, where we found the eigenvalues of a 2 x 2 matrix by solving the 
characteristic equation. The following example involves a 3 x 3 matrix. 



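EXAMPLE 2 Eigenvalues of a 3 x 3 Matrix

Find the eigenvalues of

$$A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 4 & -17 & 8 \end{bmatrix}$$

Solution

The characteristic polynomial of $A$ is

$$\det(\lambda I - A) = \det\begin{bmatrix} \lambda & -1 & 0 \\ 0 & \lambda & -1 \\ -4 & 17 & \lambda - 8 \end{bmatrix} = \lambda^3 - 8\lambda^2 + 17\lambda - 4$$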



The eigenvalues of $A$ must therefore satisfy the cubic equation

$$\lambda^3 - 8\lambda^2 + 17\lambda - 4 = 0 \tag{2}$$

To solve this equation, we shall begin by searching for integer solutions. This task can be greatly simplified by exploiting the
fact that all integer solutions (if there are any) to a polynomial equation with integer coefficients

$$\lambda^n + c_1\lambda^{n-1} + \cdots + c_n = 0$$

must be divisors of the constant term, $c_n$. Thus, the only possible integer solutions of (2) are the divisors of -4, that is, ±1, ±2, ±4.
Successively substituting these values in (2) shows that $\lambda = 4$ is an integer solution. As a consequence, $\lambda - 4$ must be a factor of
the left side of (2). Dividing $\lambda - 4$ into $\lambda^3 - 8\lambda^2 + 17\lambda - 4$ shows that (2) can be rewritten as

$$(\lambda - 4)(\lambda^2 - 4\lambda + 1) = 0$$

Thus the remaining solutions of (2) satisfy the quadratic equation

$$\lambda^2 - 4\lambda + 1 = 0$$

which can be solved by the quadratic formula. Thus the eigenvalues of $A$ are

$$\lambda = 4, \quad \lambda = 2 + \sqrt{3}, \quad \text{and} \quad \lambda = 2 - \sqrt{3}$$
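
The search in Example 2 (try the divisors of the constant term, divide out the factor found, then apply the quadratic formula) can be sketched in Python. This is an illustrative sketch only; the function names `poly_eval`, `integer_roots`, and `deflate` are assumptions introduced here, and the code assumes a monic polynomial with integer coefficients and nonzero constant term.

```python
import math


def poly_eval(coeffs, x):
    """Evaluate a polynomial (coefficients listed highest degree first) by Horner's rule."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result


def integer_roots(coeffs):
    """Integer roots of a monic integer polynomial whose constant term is nonzero."""
    constant = abs(coeffs[-1])
    divisors = [d for d in range(1, constant + 1) if constant % d == 0]
    return [r for d in divisors for r in (d, -d) if poly_eval(coeffs, r) == 0]


def deflate(coeffs, root):
    """Quotient of the polynomial divided by (x - root), via synthetic division."""
    quotient = [coeffs[0]]
    for c in coeffs[1:-1]:
        quotient.append(c + root * quotient[-1])
    return quotient


p = [1, -8, 17, -4]            # lambda^3 - 8 lambda^2 + 17 lambda - 4
roots = integer_roots(p)       # candidates tested: +-1, +-2, +-4
q = deflate(p, roots[0])       # remaining quadratic factor

a, b, c = q
disc = b * b - 4 * a * c
r1 = (-b + math.sqrt(disc)) / (2 * a)
r2 = (-b - math.sqrt(disc)) / (2 * a)
print(roots, q, r1, r2)
```

As the Remark that follows points out, this approach is practical only for small matrices; it is shown here purely to mirror the hand computation.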



Remark In practical problems, the matrix A is usually so large that computing the characteristic equation is not practical. As a 
result, other methods are used to obtain eigenvalues. 



EXAMPLE 3 Eigenvalues of an Upper Triangular Matrix 



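Find the eigenvalues of the upper triangular matrix

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ 0 & a_{22} & a_{23} & a_{24} \\ 0 & 0 & a_{33} & a_{34} \\ 0 & 0 & 0 & a_{44} \end{bmatrix}$$

Solution

Recalling that the determinant of a triangular matrix is the product of the entries on the main diagonal (Theorem 2.1.3), we
obtain

$$\det(\lambda I - A) = \det\begin{bmatrix} \lambda - a_{11} & -a_{12} & -a_{13} & -a_{14} \\ 0 & \lambda - a_{22} & -a_{23} & -a_{24} \\ 0 & 0 & \lambda - a_{33} & -a_{34} \\ 0 & 0 & 0 & \lambda - a_{44} \end{bmatrix} = (\lambda - a_{11})(\lambda - a_{22})(\lambda - a_{33})(\lambda - a_{44})$$

Thus, the characteristic equation is

$$(\lambda - a_{11})(\lambda - a_{22})(\lambda - a_{33})(\lambda - a_{44}) = 0$$

and the eigenvalues are

$$\lambda = a_{11}, \quad \lambda = a_{22}, \quad \lambda = a_{33}, \quad \lambda = a_{44}$$

which are precisely the diagonal entries of $A$.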

The following general theorem should be evident from the computations in the preceding example. 



THEOREM 7.1.1 



If $A$ is an $n \times n$ triangular matrix (upper triangular, lower triangular, or diagonal), then the eigenvalues of $A$ are the entries
on the main diagonal of $A$.



EXAMPLE 4 Eigenvalues of a Lower Triangular Matrix 



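By inspection, the eigenvalues of the lower triangular matrix

$$A = \begin{bmatrix} \tfrac{1}{2} & 0 & 0 \\ -1 & \tfrac{2}{3} & 0 \\ 5 & -8 & -\tfrac{1}{4} \end{bmatrix}$$

are $\lambda = \tfrac{1}{2}$, $\lambda = \tfrac{2}{3}$, and $\lambda = -\tfrac{1}{4}$.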

Complex Eigenvalues 

It is possible for the characteristic equation of a matrix with real entries to have complex solutions. In fact, because the
eigenvalues of an $n \times n$ matrix are the roots of a polynomial of precise degree $n$, every $n \times n$ matrix has exactly $n$ eigenvalues if
we count them as we count the roots of a polynomial (meaning that they may be repeated, and may occur in complex conjugate
pairs). For example, the characteristic polynomial of the matrix

$$A = \begin{bmatrix} -2 & -1 \\ 5 & 2 \end{bmatrix}$$

is

$$\det(\lambda I - A) = \det\begin{bmatrix} \lambda + 2 & 1 \\ -5 & \lambda - 2 \end{bmatrix} = \lambda^2 + 1$$

so the characteristic equation is $\lambda^2 + 1 = 0$, the solutions of which are the imaginary numbers $\lambda = i$ and $\lambda = -i$. Thus we are
forced to consider complex eigenvalues, even for real matrices. This, in turn, leads us to consider the possibility of complex
vector spaces, that is, vector spaces in which scalars are allowed to have complex values. Such vector spaces will be
considered in Chapter 10. For now, we will allow complex eigenvalues, but we will limit our discussion of eigenvectors to the
case of real eigenvalues.
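
For a 2 x 2 matrix the characteristic equation can be written as $\lambda^2 - \operatorname{tr}(A)\lambda + \det(A) = 0$ (this is Exercise 16 of this section), so complex eigenvalues fall out of the quadratic formula once complex square roots are allowed. The sketch below is an illustration under that assumption; the function name `eigenvalues_2x2` is hypothetical, and Python's standard `cmath` module supplies the complex square root.

```python
import cmath


def eigenvalues_2x2(A):
    """Roots of lambda^2 - tr(A) lambda + det(A) = 0 for a 2 x 2 matrix A."""
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    # cmath.sqrt returns a complex root even when the discriminant is negative
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


lam1, lam2 = eigenvalues_2x2([[-2, -1], [5, 2]])  # matrix from the text
print(lam1, lam2)
```

For the matrix above the trace is 0 and the determinant is 1, so the discriminant is -4 and the two roots are the conjugate pair found in the text.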

The following theorem summarizes our discussion thus far. 



THEOREM 7.1.2 



Equivalent Statements 

If $A$ is an $n \times n$ matrix and $\lambda$ is a real number, then the following are equivalent.

(a) $\lambda$ is an eigenvalue of $A$.

(b) The system of equations $(\lambda I - A)\mathbf{x} = \mathbf{0}$ has nontrivial solutions.

(c) There is a nonzero vector $\mathbf{x}$ in $R^n$ such that $A\mathbf{x} = \lambda\mathbf{x}$.

(d) $\lambda$ is a solution of the characteristic equation $\det(\lambda I - A) = 0$.



Finding Eigenvectors and Bases for Eigenspaces 

Now that we know how to find eigenvalues, we turn to the problem of finding eigenvectors. The eigenvectors of $A$
corresponding to an eigenvalue $\lambda$ are the nonzero vectors $\mathbf{x}$ that satisfy $A\mathbf{x} = \lambda\mathbf{x}$. Equivalently, the eigenvectors corresponding to
$\lambda$ are the nonzero vectors in the solution space of $(\lambda I - A)\mathbf{x} = \mathbf{0}$, that is, in the null space of $\lambda I - A$. We call this solution space
the eigenspace of $A$ corresponding to $\lambda$.



EXAMPLE 5 Eigenvectors and Bases for Eigenspaces 



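Find bases for the eigenspaces of

$$A = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix}$$

Solution

The characteristic equation of matrix $A$ is $\lambda^3 - 5\lambda^2 + 8\lambda - 4 = 0$, or, in factored form, $(\lambda - 1)(\lambda - 2)^2 = 0$ (verify); thus the
eigenvalues of $A$ are $\lambda_1 = 1$ and $\lambda_{2,3} = 2$, so there are two eigenspaces of $A$.

By definition,

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$

is an eigenvector of $A$ corresponding to $\lambda$ if and only if $\mathbf{x}$ is a nontrivial solution of $(\lambda I - A)\mathbf{x} = \mathbf{0}$, that is, of

$$\begin{bmatrix} \lambda & 0 & 2 \\ -1 & \lambda - 2 & -1 \\ -1 & 0 & \lambda - 3 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \tag{3}$$

If $\lambda = 2$, then (3) becomes

$$\begin{bmatrix} 2 & 0 & 2 \\ -1 & 0 & -1 \\ -1 & 0 & -1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

Solving this system using Gaussian elimination yields (verify)

$$x_1 = -s, \quad x_2 = t, \quad x_3 = s$$

Thus, the eigenvectors of $A$ corresponding to $\lambda = 2$ are the nonzero vectors of the form

$$\mathbf{x} = \begin{bmatrix} -s \\ t \\ s \end{bmatrix} = \begin{bmatrix} -s \\ 0 \\ s \end{bmatrix} + \begin{bmatrix} 0 \\ t \\ 0 \end{bmatrix} = s\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} + t\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$$

Since

$$\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$$

are linearly independent, these vectors form a basis for the eigenspace corresponding to $\lambda = 2$.

If $\lambda = 1$, then (3) becomes

$$\begin{bmatrix} 1 & 0 & 2 \\ -1 & -1 & -1 \\ -1 & 0 & -2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

Solving this system yields (verify)

$$x_1 = -2s, \quad x_2 = s, \quad x_3 = s$$

Thus the eigenvectors corresponding to $\lambda = 1$ are the nonzero vectors of the form

$$\begin{bmatrix} -2s \\ s \\ s \end{bmatrix} = s\begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} \quad \text{so that} \quad \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix}$$

is a basis for the eigenspace corresponding to $\lambda = 1$.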

Notice that the zero vector is in every eigenspace, although it isn't an eigenvector. 
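
The "(verify)" steps in Example 5 amount to checking that each basis vector satisfies $A\mathbf{x} = \lambda\mathbf{x}$. A minimal Python sketch of that check follows; the helper `mat_vec` and the list `eigenpairs` are illustrative names, not notation from the text.

```python
def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector (list of entries)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


A = [[0, 0, -2],
     [1, 2, 1],
     [1, 0, 3]]

# (eigenvalue, basis vector) pairs found in Example 5
eigenpairs = [(2, [-1, 0, 1]), (2, [0, 1, 0]), (1, [-2, 1, 1])]

for lam, x in eigenpairs:
    lhs = mat_vec(A, x)             # A x
    rhs = [lam * xi for xi in x]    # lambda x
    print(lam, x, lhs == rhs)

# The zero vector also satisfies A x = lambda x, which is why it lies in every
# eigenspace, but it is excluded from the eigenvectors by definition.
zero = [0, 0, 0]
print(mat_vec(A, zero) == zero)
```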

Powers of a Matrix 

Once the eigenvalues and eigenvectors of a matrix $A$ are found, it is a simple matter to find the eigenvalues and eigenvectors of
any positive integer power of $A$; for example, if $\lambda$ is an eigenvalue of $A$ and $\mathbf{x}$ is a corresponding eigenvector, then

$$A^2\mathbf{x} = A(A\mathbf{x}) = A(\lambda\mathbf{x}) = \lambda(A\mathbf{x}) = \lambda(\lambda\mathbf{x}) = \lambda^2\mathbf{x}$$

which shows that $\lambda^2$ is an eigenvalue of $A^2$ and that $\mathbf{x}$ is a corresponding eigenvector. In general, we have the following result.



THEOREM 7.1.3 



If $k$ is a positive integer, $\lambda$ is an eigenvalue of a matrix $A$, and $\mathbf{x}$ is a corresponding eigenvector, then $\lambda^k$ is an eigenvalue of
$A^k$ and $\mathbf{x}$ is a corresponding eigenvector.



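EXAMPLE 6 Using Theorem 7.1.3

In Example 5 we showed that the eigenvalues of

$$A = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix}$$

are $\lambda = 2$ and $\lambda = 1$, so from Theorem 7.1.3, both $\lambda = 2^7 = 128$ and $\lambda = 1^7 = 1$ are eigenvalues of $A^7$. We also showed that

$$\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$$

are eigenvectors of $A$ corresponding to the eigenvalue $\lambda = 2$, so from Theorem 7.1.3, they are also eigenvectors of $A^7$
corresponding to $\lambda = 2^7 = 128$. Similarly, the eigenvector

$$\begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix}$$

of $A$ corresponding to the eigenvalue $\lambda = 1$ is also an eigenvector of $A^7$ corresponding to $\lambda = 1^7 = 1$.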
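
The conclusion of Example 6 can be confirmed by forming $A^7$ explicitly and applying it to the eigenvectors. The sketch below does this with exact integer arithmetic; `mat_mul`, `mat_pow`, and `mat_vec` are helper names assumed for illustration, and `mat_pow` uses plain repeated multiplication for positive integer exponents.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def mat_pow(A, k):
    """A raised to a positive integer power k by repeated multiplication."""
    result = A
    for _ in range(k - 1):
        result = mat_mul(result, A)
    return result


def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector (list of entries)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


A = [[0, 0, -2],
     [1, 2, 1],
     [1, 0, 3]]
A7 = mat_pow(A, 7)

# Eigenvectors of A for lambda = 2 should be scaled by 2**7 = 128 under A^7,
# and the eigenvector for lambda = 1 should be left unchanged.
print(mat_vec(A7, [-1, 0, 1]))
print(mat_vec(A7, [0, 1, 0]))
print(mat_vec(A7, [-2, 1, 1]))
```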



Eigenvalues and Invertibility 



The next theorem establishes a relationship between the eigenvalues and the invertibility of a matrix.

THEOREM 7.1.4

A square matrix $A$ is invertible if and only if $\lambda = 0$ is not an eigenvalue of $A$.

Proof Assume that $A$ is an $n \times n$ matrix and observe first that $\lambda = 0$ is a solution of the characteristic equation

$$\lambda^n + c_1\lambda^{n-1} + \cdots + c_n = 0$$

if and only if the constant term $c_n$ is zero. Thus it suffices to prove that $A$ is invertible if and only if
$c_n \neq 0$. But

$$\det(\lambda I - A) = \lambda^n + c_1\lambda^{n-1} + \cdots + c_n$$

or, on setting $\lambda = 0$,

$$\det(-A) = c_n \quad \text{or} \quad (-1)^n\det(A) = c_n$$

It follows from the last equation that $\det(A) = 0$ if and only if $c_n = 0$, and this in turn implies that $A$ is
invertible if and only if $c_n \neq 0$.



EXAMPLE 7 Using Theorem 7.1.4

The matrix $A$ in Example 5 is invertible since it has eigenvalues $\lambda = 1$ and $\lambda = 2$, neither of which is zero. We leave it for the
reader to check this conclusion by showing that $\det(A) \neq 0$.
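
The check suggested in Example 7, together with the relation $(-1)^n\det(A) = c_n$ from the proof of Theorem 7.1.4, can be sketched as follows. The function name `det3` is an assumption made for this illustration; it expands a 3 x 3 determinant by cofactors along the first row.

```python
def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


A = [[0, 0, -2],
     [1, 2, 1],
     [1, 0, 3]]

det_A = det3(A)
print(det_A)  # nonzero, so A is invertible

# Constant term of the characteristic polynomial lambda^3 - 5 lambda^2 + 8 lambda - 4
c_n = -4
n = 3
print((-1) ** n * det_A == c_n)  # relation from the proof of Theorem 7.1.4
```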

Summary 

Theorem 7.1.4 enables us to add an additional result to Theorem 6.4.5.

THEOREM 7.1.5

Equivalent Statements

If $A$ is an $n \times n$ matrix, and if $T_A : R^n \to R^n$ is multiplication by $A$, then the following are equivalent.

(a) $A$ is invertible.

(b) $A\mathbf{x} = \mathbf{0}$ has only the trivial solution.

(c) The reduced row-echelon form of $A$ is $I_n$.

(d) $A$ is expressible as a product of elementary matrices.

(e) $A\mathbf{x} = \mathbf{b}$ is consistent for every $n \times 1$ matrix $\mathbf{b}$.

(f) $A\mathbf{x} = \mathbf{b}$ has exactly one solution for every $n \times 1$ matrix $\mathbf{b}$.

(g) $\det(A) \neq 0$.

(h) The range of $T_A$ is $R^n$.

(i) $T_A$ is one-to-one.

(j) The column vectors of $A$ are linearly independent.

(k) The row vectors of $A$ are linearly independent.

(l) The column vectors of $A$ span $R^n$.

(m) The row vectors of $A$ span $R^n$.

(n) The column vectors of $A$ form a basis for $R^n$.

(o) The row vectors of $A$ form a basis for $R^n$.

(p) $A$ has rank $n$.

(q) $A$ has nullity 0.

(r) The orthogonal complement of the nullspace of $A$ is $R^n$.

(s) The orthogonal complement of the row space of $A$ is $\{\mathbf{0}\}$.

(t) $A^T A$ is invertible.

(u) $\lambda = 0$ is not an eigenvalue of $A$.



This theorem relates all of the major topics we have studied thus far. 



Exercise Set 7.1 






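1. Find the characteristic equations of the following matrices:

(a) $\begin{bmatrix} 3 & 0 \\ 8 & -1 \end{bmatrix}$

(b) $\begin{bmatrix} 10 & -9 \\ 4 & -2 \end{bmatrix}$

(c) $\begin{bmatrix} 0 & 3 \\ 4 & 0 \end{bmatrix}$

(d) $\begin{bmatrix} -2 & -7 \\ 1 & 2 \end{bmatrix}$

(e) $\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$

(f) $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

2. Find the eigenvalues of the matrices in Exercise 1.

3. Find bases for the eigenspaces of the matrices in Exercise 1.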



Find the characteristic equations of the following matrices: 



(a) 



4 





r 


-2 


1 





-2 





i 



(b) 



-5 
-1 

1 -2 



(c) 



-2 1 

-6 -2 

19 5 -4 



(d) 



1 


f 


1 3 





4 13 


-1 



(e) 



5 





1" 


1 


1 





7 


1 






(f) 



5 


6 


2 





-1 


-8 


1 





-2 



Find the eigenvalues of the matrices in Exercise 4. 



6. 



Find bases for the eigenspaces of the matrices in Exercise 4. 



Find the characteristic equations of the following matrices: 



(a) 





1 
1 




2 


o" 


1 





-2 








1 



(b) 



10 
4 








-2 -7 

1 2 



Find the eigenvalues of the matrices in Exercise 7. 



9. 



Find bases for the eigenspaces of the matrices in Exercise 7. 



10. 



By inspection, find the eigenvalues of the following matrices: 



(a) 



-1 6 
5 



(b) 



(c) 



3 







2 7 







4 8 


1 




1 
3 








- 


1 

" 3 











1 












11. 



Find the eigenvalues of jfi for 



A = 



1 


3 


7 


11 





l 

2 


3 


8 











4 











2 



12. 



Find the eigenvalues and bases for the eigenspaces of $A^{25}$ for

$$A = \begin{bmatrix} -1 & -2 & -2 \\ 1 & 2 & 1 \\ -1 & -1 & 0 \end{bmatrix}$$



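13. Let $A$ be a 2 x 2 matrix, and call a line through the origin of $R^2$ invariant under $A$ if $A\mathbf{x}$ lies on the line when $\mathbf{x}$ does. Find
equations for all lines in $R^2$, if any, that are invariant under the given matrix.

(a) $A = \begin{bmatrix} 4 & -1 \\ 2 & 1 \end{bmatrix}$

(b) $A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$

(c) $A = \begin{bmatrix} 2 & 3 \\ 0 & 2 \end{bmatrix}$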

14. 



Find $\det(A)$ given that $A$ has $p(\lambda)$ as its characteristic polynomial.

(a) $p(\lambda) = \lambda^3 - 2\lambda^2 + \lambda + 5$

(b) $p(\lambda) = \lambda^4 - \lambda^3 + 7$

Hint: See the proof of Theorem 7.1.4.



15. 



Let A be an n x n matrix. 



(a) Prove that the characteristic polynomial of A has degree n. 



(b) Prove that the coefficient of $\lambda^n$ in the characteristic polynomial is 1.



16. 



Show that the characteristic equation of a 2 x 2 matrix $A$ can be expressed as $\lambda^2 - \operatorname{tr}(A)\lambda + \det(A) = 0$, where $\operatorname{tr}(A)$ is the
trace of $A$.



17. Use the result in Exercise 16 to show that if

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

then the solutions of the characteristic equation of $A$ are

$$\lambda = \tfrac{1}{2}\left[(a + d) \pm \sqrt{(a - d)^2 + 4bc}\,\right]$$

Use this result to show that $A$ has

(a) two distinct real eigenvalues if $(a - d)^2 + 4bc > 0$

(b) two repeated real eigenvalues if $(a - d)^2 + 4bc = 0$

(c) complex conjugate eigenvalues if $(a - d)^2 + 4bc < 0$



18. 



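Let $A$ be the matrix in Exercise 17. Show that if $(a - d)^2 + 4bc > 0$ and $b \neq 0$, then eigenvectors of $A$ corresponding to the
eigenvalues

$$\lambda_1 = \tfrac{1}{2}\left[(a + d) + \sqrt{(a - d)^2 + 4bc}\,\right] \quad \text{and} \quad \lambda_2 = \tfrac{1}{2}\left[(a + d) - \sqrt{(a - d)^2 + 4bc}\,\right]$$

are

$$\begin{bmatrix} -b \\ a - \lambda_1 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} -b \\ a - \lambda_2 \end{bmatrix}$$

respectively.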



19. 



Prove: If $a$, $b$, $c$, and $d$ are integers such that $a + b = c + d$, then

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

has integer eigenvalues, namely, $\lambda_1 = a + b$ and $\lambda_2 = a - c$.

20. Prove: If $\lambda$ is an eigenvalue of an invertible matrix $A$, and $\mathbf{x}$ is a corresponding eigenvector, then $1/\lambda$ is an eigenvalue of
$A^{-1}$, and $\mathbf{x}$ is a corresponding eigenvector.

21. Prove: If $\lambda$ is an eigenvalue of $A$, $\mathbf{x}$ is a corresponding eigenvector, and $s$ is a scalar, then $\lambda - s$ is an eigenvalue of $A - sI$,
and $\mathbf{x}$ is a corresponding eigenvector.



22. Find the eigenvalues and bases for the eigenspaces of

$$A = \begin{bmatrix} -2 & 2 & 3 \\ -2 & 3 & 2 \\ -4 & 2 & 5 \end{bmatrix}$$

Then use Exercises 20 and 21 to find the eigenvalues and bases for the eigenspaces of

(a) $A^{-1}$

(b) $A - 3I$

(c) $A + 2I$



23.

(a) Prove that if $A$ is a square matrix, then $A$ and $A^T$ have the same eigenvalues.
Hint: Look at the characteristic equation $\det(\lambda I - A) = 0$.

(b) Show that $A$ and $A^T$ need not have the same eigenspaces.
Hint: Use the result in Exercise 18 to find a 2 x 2 matrix for which $A$ and $A^T$ have different eigenspaces.

Discussion
Discovery

24. Indicate whether each statement is always true or sometimes false. Justify your answer by giving
a logical argument or a counterexample. In each part, $A$ is an $n \times n$ matrix.

(a) If $A\mathbf{x} = \lambda\mathbf{x}$ for some nonzero scalar $\lambda$, then $\mathbf{x}$ is an eigenvector of $A$.

(b) If $\lambda$ is not an eigenvalue of $A$, then the linear system $(\lambda I - A)\mathbf{x} = \mathbf{0}$ has only the trivial
solution.

(c) If $\lambda = 0$ is an eigenvalue of $A$, then $A^2$ is singular.

(d) If the characteristic polynomial of $A$ is $p(\lambda) = \lambda^n + 1$, then $A$ is invertible.



25. Suppose that the characteristic polynomial of some matrix $A$ is found to be
$p(\lambda) = (\lambda - 1)(\lambda - 3)(\lambda - 4)$. In each part, answer the question and explain your reasoning.



(a) What is the size of A? 

(b) Is A invertible? 

(c) How many eigenspaces does A have? 



26. The eigenvectors that we have been studying are sometimes called right eigenvectors to
distinguish them from left eigenvectors, which are $n \times 1$ column matrices $\mathbf{x}$ that satisfy
$\mathbf{x}^T A = \mu\mathbf{x}^T$ for some scalar $\mu$. What is the relationship, if any, between the right eigenvectors and
corresponding eigenvalues $\lambda$ of $A$ and the left eigenvectors and corresponding eigenvalues $\mu$ of $A$?
7.2

DIAGONALIZATION

In this section we shall be concerned with the problem of finding a basis for
$R^n$ that consists of eigenvectors of a given $n \times n$ matrix $A$. Such bases can be
used to study geometric properties of $A$ and to simplify various numerical
computations involving $A$. These bases are also of physical significance in a
wide variety of applications, some of which will be considered later in this text.



The Matrix Diagonalization Problem 

Our first objective in this section is to show that the following two problems, which on the surface seem quite different, are 
actually equivalent. 

The Eigenvector Problem Given an $n \times n$ matrix $A$, does there exist a basis for $R^n$ consisting of eigenvectors of $A$?

The Diagonalization Problem (Matrix Form) Given an $n \times n$ matrix $A$, does there exist an invertible matrix $P$ such that $P^{-1}AP$
is a diagonal matrix?

The latter problem suggests the following terminology. 



DEFINITION 



A square matrix $A$ is called diagonalizable if there is an invertible matrix $P$ such that $P^{-1}AP$ is a diagonal matrix; the matrix
$P$ is said to diagonalize $A$.



The following theorem shows that the eigenvector problem and the diagonalization problem are equivalent. 
THEOREM 7.2.1 



If $A$ is an $n \times n$ matrix, then the following are equivalent.

(a) A is diagonalizable. 

(b) A has n linearly independent eigenvectors. 



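Proof (a) ⇒ (b) Since $A$ is assumed diagonalizable, there is an invertible matrix

$$P = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & \vdots & & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nn} \end{bmatrix}$$

such that $P^{-1}AP$ is diagonal, say $P^{-1}AP = D$, where

$$D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}$$

It follows from the formula $P^{-1}AP = D$ that $AP = PD$; that is,

$$AP = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & \vdots & & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nn} \end{bmatrix}\begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} = \begin{bmatrix} \lambda_1 p_{11} & \lambda_2 p_{12} & \cdots & \lambda_n p_{1n} \\ \lambda_1 p_{21} & \lambda_2 p_{22} & \cdots & \lambda_n p_{2n} \\ \vdots & \vdots & & \vdots \\ \lambda_1 p_{n1} & \lambda_2 p_{n2} & \cdots & \lambda_n p_{nn} \end{bmatrix} \tag{1}$$

If we now let $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ denote the column vectors of $P$, then from (1), the successive columns of $AP$
are $\lambda_1\mathbf{p}_1, \lambda_2\mathbf{p}_2, \ldots, \lambda_n\mathbf{p}_n$. However, from Formula 6 of Section 1.3, the successive columns of $AP$ are $A\mathbf{p}_1$,
$A\mathbf{p}_2, \ldots, A\mathbf{p}_n$. Thus we must have

$$A\mathbf{p}_1 = \lambda_1\mathbf{p}_1, \quad A\mathbf{p}_2 = \lambda_2\mathbf{p}_2, \quad \ldots, \quad A\mathbf{p}_n = \lambda_n\mathbf{p}_n \tag{2}$$

Since $P$ is invertible, its column vectors are all nonzero; thus, it follows from (2) that $\lambda_1, \lambda_2, \ldots, \lambda_n$ are
eigenvalues of $A$, and $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ are corresponding eigenvectors. Since $P$ is invertible, it follows
from Theorem 7.1.5 that $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ are linearly independent. Thus $A$ has $n$ linearly independent
eigenvectors.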

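(b) ⇒ (a) Assume that $A$ has $n$ linearly independent eigenvectors, $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$, with corresponding eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$,
and let

$$P = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & \vdots & & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nn} \end{bmatrix}$$

be the matrix whose column vectors are $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$. By Formula 6 of Section 1.3, the column vectors of the product $AP$ are

$$A\mathbf{p}_1, \quad A\mathbf{p}_2, \quad \ldots, \quad A\mathbf{p}_n$$

But

$$A\mathbf{p}_1 = \lambda_1\mathbf{p}_1, \quad A\mathbf{p}_2 = \lambda_2\mathbf{p}_2, \quad \ldots, \quad A\mathbf{p}_n = \lambda_n\mathbf{p}_n$$

so

$$AP = \begin{bmatrix} \lambda_1 p_{11} & \lambda_2 p_{12} & \cdots & \lambda_n p_{1n} \\ \lambda_1 p_{21} & \lambda_2 p_{22} & \cdots & \lambda_n p_{2n} \\ \vdots & \vdots & & \vdots \\ \lambda_1 p_{n1} & \lambda_2 p_{n2} & \cdots & \lambda_n p_{nn} \end{bmatrix} = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & \vdots & & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nn} \end{bmatrix}\begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} = PD \tag{3}$$

where $D$ is the diagonal matrix having the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ on the main diagonal. Since the column vectors of $P$ are
linearly independent, $P$ is invertible. Thus (3) can be rewritten as $P^{-1}AP = D$; that is, $A$ is diagonalizable.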



Procedure for Diagonalizing a Matrix 



The preceding theorem guarantees that an n x n matrix A with n linearly independent eigenvectors is diagonalizable, and the 
proof provides the following method for diagonalizing A. 



Step 1. Find $n$ linearly independent eigenvectors of $A$, say $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$.

Step 2. Form the matrix $P$ having $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ as its column vectors.

Step 3. The matrix $P^{-1}AP$ will then be diagonal with $\lambda_1, \lambda_2, \ldots, \lambda_n$ as its successive diagonal entries, where $\lambda_i$ is the
eigenvalue corresponding to $\mathbf{p}_i$ for $i = 1, 2, \ldots, n$.



In order to carry out Step 1 of this procedure, one first needs a way of determining whether a given nxn matrix A has n linearly 
independent eigenvectors, and then one needs a method for finding them. One can address both problems at the same time by 
finding bases for the eigenspaces of A. Later in this section, we will show that those basis vectors, as a combined set, are linearly 
independent, so that if there is a total of n such vectors, then A is diagonalizable, and the n basis vectors can be used as the 
column vectors of the diagonalizing matrix P. If there are fewer than n basis vectors, then A is not diagonalizable. 
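
The three-step procedure can be sketched in Python using the eigenvectors found in Example 5 of Section 7.1. This is an illustrative sketch, not the text's method of verification: instead of inverting $P$, it checks the equivalent relation $AP = PD$ from the proof of Theorem 7.2.1 and confirms that $P$ is invertible through a nonzero determinant. The helper names (`mat_mul`, `columns_to_matrix`, `diagonal`, `det3`) are assumptions.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def columns_to_matrix(columns):
    """Step 2: build the matrix whose column vectors are the given vectors."""
    return [list(row) for row in zip(*columns)]


def diagonal(entries):
    """Diagonal matrix with the given entries on the main diagonal."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


A = [[0, 0, -2],
     [1, 2, 1],
     [1, 0, 3]]

# Step 1: eigenvectors and matching eigenvalues from Example 5 of Section 7.1
eigenvectors = [[-1, 0, 1], [0, 1, 0], [-2, 1, 1]]
eigenvalues = [2, 2, 1]

P = columns_to_matrix(eigenvectors)   # Step 2
D = diagonal(eigenvalues)             # expected value of P^{-1} A P in Step 3

print(det3(P) != 0)                   # P is invertible
print(mat_mul(A, P) == mat_mul(P, D)) # AP = PD, equivalent to P^{-1} A P = D
```

Checking $AP = PD$ avoids computing an inverse and keeps the arithmetic in exact integers; reordering `eigenvectors` and `eigenvalues` together produces a different but equally valid pair $P$, $D$.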



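EXAMPLE 1 Finding a Matrix P That Diagonalizes a Matrix A

Find a matrix $P$ that diagonalizes

$$A = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix}$$

Solution

From Example 5 of the preceding section, we found the characteristic equation of $A$ to be

$$(\lambda - 1)(\lambda - 2)^2 = 0$$

and we found the following bases for the eigenspaces:

$$\lambda = 2: \quad \mathbf{p}_1 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}, \quad \mathbf{p}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \qquad\qquad \lambda = 1: \quad \mathbf{p}_3 = \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix}$$

There are three basis vectors in total, so the matrix $A$ is diagonalizable and

$$P = \begin{bmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix}$$

diagonalizes $A$. As a check, the reader should verify that

$$P^{-1}AP = \begin{bmatrix} 1 & 0 & 2 \\ 1 & 1 & 1 \\ -1 & 0 & -1 \end{bmatrix}\begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix}\begin{bmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$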

There is no preferred order for the columns of $P$. Since the $i$th diagonal entry of $P^{-1}AP$ is an eigenvalue for the $i$th column
vector of $P$, changing the order of the columns of $P$ just changes the order of the eigenvalues on the diagonal of $P^{-1}AP$. Thus,
if we had written

$$P = \begin{bmatrix} -1 & -2 & 0 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}$$

in Example 1, we would have obtained

$$P^{-1}AP = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$$


EXAMPLE 2 A Matrix That Is Not Diagonalizable 



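Find a matrix $P$ that diagonalizes

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 2 & 0 \\ -3 & 5 & 2 \end{bmatrix}$$

Solution

The characteristic polynomial of $A$ is

$$\det(\lambda I - A) = \begin{vmatrix} \lambda - 1 & 0 & 0 \\ -1 & \lambda - 2 & 0 \\ 3 & -5 & \lambda - 2 \end{vmatrix} = (\lambda - 1)(\lambda - 2)^2$$

so the characteristic equation is

$$(\lambda - 1)(\lambda - 2)^2 = 0$$

Thus the eigenvalues of $A$ are $\lambda_1 = 1$ and $\lambda_{2,3} = 2$. We leave it for the reader to show that bases for the eigenspaces are

$$\lambda = 1: \quad \mathbf{p}_1 = \begin{bmatrix} \tfrac{1}{8} \\ -\tfrac{1}{8} \\ 1 \end{bmatrix} \qquad\qquad \lambda = 2: \quad \mathbf{p}_2 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$

Since $A$ is a 3 x 3 matrix and there are only two basis vectors in total, $A$ is not diagonalizable.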

Alternative Solution 

If one is interested only in determining whether a matrix is diagonalizable and is not concerned with actually finding a
diagonalizing matrix $P$, then it is not necessary to compute bases for the eigenspaces; it suffices to find the dimensions of the
eigenspaces. For this example, the eigenspace corresponding to $\lambda = 1$ is the solution space of the system

$$\begin{bmatrix} 0 & 0 & 0 \\ -1 & -1 & 0 \\ 3 & -5 & -1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

The coefficient matrix has rank 2 (verify). Thus the nullity of this matrix is 1 by Theorem 5.6.3, and hence the solution space is
one-dimensional.

The eigenspace corresponding to $\lambda = 2$ is the solution space of the system

$$\begin{bmatrix} 1 & 0 & 0 \\ -1 & 0 & 0 \\ 3 & -5 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

This coefficient matrix also has rank 2 and nullity 1 (verify), so the eigenspace corresponding to $\lambda = 2$ is also one-dimensional.

Since the eigenspaces produce a total of two basis vectors, the matrix $A$ is not diagonalizable.
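
The dimension count in the alternative solution can be sketched in Python: the dimension of the eigenspace for $\lambda$ is the nullity of $\lambda I - A$, that is, $n$ minus its rank. The sketch below is illustrative only; the function names `rank` and `eigenspace_dimension` are assumptions, and the rank is found by Gaussian elimination in exact rational arithmetic using the standard `fractions` module.

```python
from fractions import Fraction


def rank(M):
    """Rank of a matrix, found by Gaussian elimination with exact fractions."""
    rows = [[Fraction(v) for v in row] for row in M]
    n_cols = len(rows[0]) if rows else 0
    pivots = 0
    for col in range(n_cols):
        # find a row at or below the current pivot position with a nonzero entry
        pivot = next((r for r in range(pivots, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        for r in range(pivots + 1, len(rows)):
            factor = rows[r][col] / rows[pivots][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivots])]
        pivots += 1
    return pivots


def eigenspace_dimension(A, lam):
    """Nullity of (lam * I - A), i.e. n minus the rank of that matrix."""
    n = len(A)
    shifted = [[(lam if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
    return n - rank(shifted)


A = [[1, 0, 0],
     [1, 2, 0],
     [-3, 5, 2]]          # matrix from Example 2

dims = {lam: eigenspace_dimension(A, lam) for lam in (1, 2)}
print(dims)
print(sum(dims.values()) == len(A))   # diagonalizable only if the dimensions total n
```

Applying the same function to the matrix of Example 1 gives eigenspace dimensions that total 3, consistent with that matrix being diagonalizable.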


There is an assumption in Example 1 that the column vectors of P, which are made up of basis vectors from the various 
eigenspaces of A, are linearly independent. The following theorem addresses this issue. 



THEOREM 7.2.2 



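If $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ are eigenvectors of $A$ corresponding to distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_k$, then $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ is a linearly
independent set.

Proof Let $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ be eigenvectors of $A$ corresponding to distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_k$. We shall assume that $\mathbf{v}_1, \mathbf{v}_2,
\ldots, \mathbf{v}_k$ are linearly dependent and obtain a contradiction. We can then conclude that $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ are linearly independent.

Since an eigenvector is nonzero by definition, $\{\mathbf{v}_1\}$ is linearly independent. Let $r$ be the largest integer such that
$\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is linearly independent. Since we are assuming that $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ is linearly dependent, $r$ satisfies $1 \le r < k$.
Moreover, by definition of $r$, $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{r+1}\}$ is linearly dependent. Thus there are scalars $c_1, c_2, \ldots, c_{r+1}$, not all zero, such
that

$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_{r+1}\mathbf{v}_{r+1} = \mathbf{0} \tag{4}$$

Multiplying both sides of (4) by $A$ and using

$$A\mathbf{v}_1 = \lambda_1\mathbf{v}_1, \quad A\mathbf{v}_2 = \lambda_2\mathbf{v}_2, \quad \ldots, \quad A\mathbf{v}_{r+1} = \lambda_{r+1}\mathbf{v}_{r+1}$$

we obtain

$$c_1\lambda_1\mathbf{v}_1 + c_2\lambda_2\mathbf{v}_2 + \cdots + c_{r+1}\lambda_{r+1}\mathbf{v}_{r+1} = \mathbf{0} \tag{5}$$

Multiplying both sides of (4) by $\lambda_{r+1}$ and subtracting the resulting equation from (5) yields

$$c_1(\lambda_1 - \lambda_{r+1})\mathbf{v}_1 + c_2(\lambda_2 - \lambda_{r+1})\mathbf{v}_2 + \cdots + c_r(\lambda_r - \lambda_{r+1})\mathbf{v}_r = \mathbf{0}$$

Since $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a linearly independent set, this equation implies that

$$c_1(\lambda_1 - \lambda_{r+1}) = c_2(\lambda_2 - \lambda_{r+1}) = \cdots = c_r(\lambda_r - \lambda_{r+1}) = 0$$

and since $\lambda_1, \lambda_2, \ldots, \lambda_{r+1}$ are distinct by hypothesis, it follows that

$$c_1 = c_2 = \cdots = c_r = 0 \tag{6}$$

Substituting these values in (4) yields

$$c_{r+1}\mathbf{v}_{r+1} = \mathbf{0}$$

Since the eigenvector $\mathbf{v}_{r+1}$ is nonzero, it follows that

$$c_{r+1} = 0 \tag{7}$$

Equations (6) and (7) contradict the fact that $c_1, c_2, \ldots, c_{r+1}$ are not all zero; this completes the proof.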



Remark Theorem 7.2.2 is a special case of a more general result: Suppose that $\lambda_1, \lambda_2, \ldots, \lambda_k$ are distinct eigenvalues and that
we choose a linearly independent set in each of the corresponding eigenspaces. If we then merge all these vectors into a single
set, the result will still be a linearly independent set. For example, if we choose three linearly independent vectors from one
eigenspace and two linearly independent vectors from another eigenspace, then the five vectors together form a linearly
independent set. We omit the proof.



As a consequence of Theorem 7.2.2, we obtain the following important result. 



THEOREM 7.2.3 



If an $n \times n$ matrix $A$ has $n$ distinct eigenvalues, then $A$ is diagonalizable.

Proof If $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ are eigenvectors corresponding to the distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, then by Theorem 7.2.2, $\mathbf{v}_1, \mathbf{v}_2,
\ldots, \mathbf{v}_n$ are linearly independent. Thus $A$ is diagonalizable by Theorem 7.2.1.



EXAMPLE 3 Using Theorem 7.2.3 



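We saw in Example 2 of the preceding section that

$$A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 4 & -17 & 8 \end{bmatrix}$$

has three distinct eigenvalues: $\lambda = 4$, $\lambda = 2 + \sqrt{3}$, and $\lambda = 2 - \sqrt{3}$. Therefore, $A$ is diagonalizable. Further,

$$P^{-1}AP = \begin{bmatrix} 4 & 0 & 0 \\ 0 & 2 + \sqrt{3} & 0 \\ 0 & 0 & 2 - \sqrt{3} \end{bmatrix}$$

for some invertible matrix $P$. If desired, the matrix $P$ can be found using the method shown in Example 1 of this section.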



EXAMPLE 4 A Diagonalizable Matrix 



From Theorem 7.1.1, the eigenvalues of a triangular matrix are the entries on its main diagonal. Thus, a triangular matrix with distinct entries on the main diagonal is diagonalizable. For example,

$$A = \begin{bmatrix} -1 & 2 & 4 & 0 \\ 0 & 3 & 1 & 7 \\ 0 & 0 & 5 & 8 \\ 0 & 0 & 0 & -2 \end{bmatrix}$$

is a diagonalizable matrix.



EXAMPLE 5 Repeated Eigenvalues and Diagonalizability 



It's important to note that Theorem 7.2.3 says only that if a matrix has all distinct eigenvalues (whether real or complex), then it is diagonalizable; in other words, only matrices with repeated eigenvalues might be nondiagonalizable. For example, the $3 \times 3$ identity matrix

$$I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

has repeated eigenvalues $\lambda_1 = \lambda_2 = \lambda_3 = 1$ but is diagonalizable since any nonzero vector in $R^3$ is an eigenvector of the $3 \times 3$ identity matrix (verify), and so, in particular, we can find three linearly independent eigenvectors. The matrix

$$J_3 = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$$

also has repeated eigenvalues $\lambda_1 = \lambda_2 = \lambda_3 = 1$, but solving for its eigenvectors leads to the system

$$(\lambda I - J_3)\mathbf{x} = \begin{bmatrix} 0 & -1 & 0 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{bmatrix}\mathbf{x} = \mathbf{0}$$

the solution of which is $x_1 = t$, $x_2 = 0$, $x_3 = 0$. Thus every eigenvector of $J_3$ is a multiple of

$$\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$$

which means that the eigenspace has dimension 1 and that $J_3$ is nondiagonalizable.

Matrices that look like the identity matrix except that the diagonal immediately above the main diagonal also has 1's on it, such as $J_3$ or

$$J_2 = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$$

are known as Jordan block matrices and are the canonical examples of nondiagonalizable matrices. The Jordan block matrix $J_n$ has an eigenspace of dimension 1 that is the span of $\mathbf{e}_1$. These matrices appear as submatrices in the Jordan decomposition, a sort of near-diagonalization for nondiagonalizable matrices.



Geometric and Algebraic Multiplicity 

We see from Example 5 that Theorem 7.2.3 does not completely settle the diagonalization problem, since it is possible for an $n \times n$ matrix $A$ to be diagonalizable without having $n$ distinct eigenvalues. We also saw this in Example 1, where the given $3 \times 3$ matrix had only two distinct eigenvalues and yet was diagonalizable. What really matter for diagonalizability are the dimensions of the eigenspaces; those dimensions must add up to $n$ in order for an $n \times n$ matrix to be diagonalizable. Examples 1 and 2 illustrate this; the matrices in those examples have the same characteristic equation and the same eigenvalues, but the matrix in Example 1 is diagonalizable because the dimensions of the eigenspaces add to 3, and the matrix in Example 2 is not diagonalizable because the dimensions only add to 2. The $3 \times 3$ matrices in Example 5 also have the same characteristic polynomial $(\lambda - 1)^3$ and hence the same eigenvalues, but the first matrix has a single eigenspace of dimension 3 and so is diagonalizable, whereas the second matrix has a single eigenspace of dimension 1 and so is not diagonalizable.

A full excursion into the study of diagonalizability is left for more advanced courses, but we shall touch on one theorem that is important to a fuller understanding of diagonalizability. It can be proved that if $\lambda_0$ is an eigenvalue of $A$, then the dimension of the eigenspace corresponding to $\lambda_0$ cannot exceed the number of times that $\lambda - \lambda_0$ appears as a factor in the characteristic polynomial of $A$. For example, in Examples 1 and 2 the characteristic polynomial is

$$(\lambda - 1)(\lambda - 2)^2$$

Thus the eigenspace corresponding to $\lambda = 1$ is at most (hence exactly) one-dimensional, and the eigenspace corresponding to $\lambda = 2$ is at most two-dimensional. In Example 1 the eigenspace corresponding to $\lambda = 2$ actually had dimension 2, resulting in diagonalizability, but in Example 2 that eigenspace had only dimension 1, resulting in nondiagonalizability.

There is some terminology that is related to these ideas. If $\lambda_0$ is an eigenvalue of an $n \times n$ matrix $A$, then the dimension of the eigenspace corresponding to $\lambda_0$ is called the geometric multiplicity of $\lambda_0$, and the number of times that $\lambda - \lambda_0$ appears as a factor in the characteristic polynomial of $A$ is called the algebraic multiplicity of $\lambda_0$. The following theorem, which we state without proof, summarizes the preceding discussion.



THEOREM 7.2.4 



Geometric and Algebraic Multiplicity 

If A is a square matrix, then 

(a) For every eigenvalue of A, the geometric multiplicity is less than or equal to the algebraic multiplicity. 



(b) A is diagonalizable if and only if for every eigenvalue, the geometric multiplicity is equal to the algebraic 
multiplicity. 
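
Theorem 7.2.4 can be explored computationally, since the geometric multiplicity of $\lambda_0$ equals $n - \operatorname{rank}(\lambda_0 I - A)$. The sketch below is an illustration only, under the assumption that the eigenvalues are already known and are rational; the function names `rank` and `geometric_multiplicity` are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix, found by Gaussian elimination over exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0  # number of pivot rows found so far
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def geometric_multiplicity(A, lam):
    """Dimension of the eigenspace of A for eigenvalue lam: n - rank(lam*I - A)."""
    n = len(A)
    shifted = [[(lam if i == j else 0) - A[i][j] for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # identity matrix of Example 5
J3 = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]   # Jordan block of Example 5

# Both have algebraic multiplicity 3 for lambda = 1.
gm_identity = geometric_multiplicity(I3, 1)   # equals 3: diagonalizable
gm_jordan = geometric_multiplicity(J3, 1)     # equals 1: not diagonalizable
```

Comparing each geometric multiplicity with the exponent of the matching factor in the characteristic polynomial gives the test in part (b) of the theorem.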



Computing Powers of a Matrix 

There are numerous problems in applied mathematics that require the computation of high powers of a square matrix. We shall 
conclude this section by showing how diagonalization can be used to simplify such computations for diagonalizable matrices. 

If $A$ is an $n \times n$ matrix and $P$ is an invertible matrix, then

$$(P^{-1}AP)^2 = P^{-1}APP^{-1}AP = P^{-1}AIAP = P^{-1}A^2P$$

More generally, for any positive integer $k$,

$$(P^{-1}AP)^k = P^{-1}A^kP \tag{8}$$

It follows from this equation that if $A$ is diagonalizable, and $P^{-1}AP = D$ is a diagonal matrix, then

$$P^{-1}A^kP = (P^{-1}AP)^k = D^k \tag{9}$$

Solving this equation for $A^k$ yields

$$A^k = PD^kP^{-1} \tag{10}$$



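This last equation expresses the $k$th power of $A$ in terms of the $k$th power of the diagonal matrix $D$. But $D^k$ is easy to compute, for if

$$D = \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & d_n \end{bmatrix} \quad \text{then} \quad D^k = \begin{bmatrix} d_1^k & 0 & \cdots & 0 \\ 0 & d_2^k & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & d_n^k \end{bmatrix}$$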



EXAMPLE 6 Power of a Matrix 



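Use (10) to find $A^{13}$, where

$$A = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix}$$

Solution

We showed in Example 1 that the matrix $A$ is diagonalized by

$$P = \begin{bmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix}$$

and that

$$D = P^{-1}AP = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Thus, from (10),

$$A^{13} = PD^{13}P^{-1} = \begin{bmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2^{13} & 0 & 0 \\ 0 & 2^{13} & 0 \\ 0 & 0 & 1^{13} \end{bmatrix} \begin{bmatrix} 1 & 0 & 2 \\ 1 & 1 & 1 \\ -1 & 0 & -1 \end{bmatrix} = \begin{bmatrix} -8190 & 0 & -16382 \\ 8191 & 8192 & 8191 \\ 8191 & 0 & 16383 \end{bmatrix} \tag{11}$$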



Remark With the method in the preceding example, most of the work is in diagonalizing $A$. Once that work is done, it can be used to compute any power of $A$. Thus, to compute $A^{1000}$ we need only change the exponents from 13 to 1000 in (11).
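
The computation in Example 6 can be reproduced in a few lines. This is a sketch for illustration, assuming the matrices $A$, $P$, and $P^{-1}$ of Example 6 with rows as listed in the code; the function names are hypothetical. It forms $PD^kP^{-1}$ and, as a cross-check, compares the result with $k$ successive multiplications by $A$.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

# Data assumed from Example 6.
A = [[0, 0, -2], [1, 2, 1], [1, 0, 3]]
P = [[-1, 0, -2], [0, 1, 1], [1, 0, 1]]
P_inv = [[1, 0, 2], [1, 1, 1], [-1, 0, -1]]
d = [2, 2, 1]  # diagonal entries of D = P^{-1} A P

def power_via_diagonalization(k):
    """A^k computed as P D^k P^{-1}; only the diagonal entries are powered."""
    Dk = [[d[i] ** k if i == j else 0 for j in range(3)] for i in range(3)]
    return matmul(matmul(P, Dk), P_inv)

def power_by_repeated_multiplication(k):
    """A^k computed directly with k matrix multiplications (for comparison)."""
    result = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
    for _ in range(k):
        result = matmul(result, A)
    return result

A13 = power_via_diagonalization(13)
```

Changing the argument from 13 to 1000 costs nothing extra in the diagonalized form, whereas the direct method needs 1000 matrix products.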



Exercise Set 7.2 






1. Let $A$ be a $6 \times 6$ matrix with characteristic equation $\lambda^2(\lambda - 1)(\lambda - 2)^3 = 0$. What are the possible dimensions for eigenspaces of $A$?



2. Let

$$A = \begin{bmatrix} 4 & 0 & 1 \\ 2 & 3 & 2 \\ 1 & 0 & 4 \end{bmatrix}$$

(a) Find the eigenvalues of $A$.

(b) For each eigenvalue $\lambda$, find the rank of the matrix $\lambda I - A$.

(c) Is $A$ diagonalizable? Justify your conclusion.



In Exercises 3-7 use the method of Exercise 2 to determine whether the matrix is diagonalizable. 



3. 



2 
1 2 



2 -3 
1 -1 



5. 



"3 





0" 





2 








1 


2 



-1 

-1 3 

-4 13 

2—10 

2 1 

3 





1 



-1 



In Exercises 8-11 find a matrix $P$ that diagonalizes $A$, and determine $P^{-1}AP$.



A = 



-14 12 
-20 17 



9. 



A = 



1 
6 -1 



10. A = 



"1 





0" 





1 


1 





1 


1 



11. A = 



2 0-2 
3 
3 



In Exercises 12-17 find the geometric and algebraic multiplicity of each eigenvalue, and determine whether $A$ is diagonalizable. If so, find a matrix $P$ that diagonalizes $A$, and determine $P^{-1}AP$.



12. A = 



19 -9 -6 
25 -11 -9 
17 -9-4 



13. A = 



-1 
-3 
-3 



-2 

3 



14. ,4 = 



"5 





0" 


1 


5 








1 


5 



15. A = 



"0 





0" 











3 





1 



16. 



A = 









o" 


-2 











3 








1 


3 



17. 



18. 



A = 





■2 5 

3 







-5 



3 



Use the method of Example 6 to compute $A^{10}$, where



^4 = 



1 
-1 2 



19. 



Use the method of Example 6 to compute $A^{11}$, where



j4 = 



-1 7 


-1" 


1 





15 


-2 



20. 



In each part, compute the stated power of 



A = 


1 




-2 

-1 




8 



-1 


(a) $A^{1000}$

(b) $A^{-1000}$

(c) $A^{2301}$

(d) $A^{-2301}$









21. 



Find $A^n$ if $n$ is a positive integer and

$$A = \begin{bmatrix} 3 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 3 \end{bmatrix}$$



22. 



Let

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

Show that:

(a) $A$ is diagonalizable if $(a - d)^2 + 4bc > 0$.

(b) $A$ is not diagonalizable if $(a - d)^2 + 4bc < 0$.

Hint See Exercise 17 of Section 7.1.



23. 



In the case where the matrix A in Exercise 22 is diagonalizable, find a matrix P that diagonalizes A. 
Hint See Exercise 18 of Section 7.1.



24. 



Prove that if A is a diagonalizable matrix, then the rank of A is the number of nonzero eigenvalues of A. 



Discussion Discovery

25. Indicate whether each statement is always true or sometimes false. Justify your answer by giving a logical argument or a counterexample.



(a) A square matrix with linearly independent column vectors is diagonalizable. 



(b) If $A$ is diagonalizable, then there is a unique matrix $P$ such that $P^{-1}AP$ is a diagonal matrix.

(c) If $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ come from different eigenspaces of $A$, then it is impossible to express $\mathbf{v}_3$ as a linear combination of $\mathbf{v}_1$ and $\mathbf{v}_2$.

(d) If $A$ is diagonalizable and invertible, then $A^{-1}$ is diagonalizable.



(e) If $A$ is diagonalizable, then $A^T$ is diagonalizable.



26. Suppose that the characteristic polynomial of some matrix $A$ is found to be $p(\lambda) = (\lambda - 1)(\lambda - 3)^2(\lambda - 4)^3$. In each part, answer the question and explain your reasoning.



(a) What can you say about the dimensions of the eigenspaces of A? 



(b) What can you say about the dimensions of the eigenspaces if you know that A is 
diagonalizable? 



(c) If $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is a linearly independent set of eigenvectors of $A$ all of which correspond
to the same eigenvalue of A, what can you say about the eigenvalue? 



27. (For Readers Who Have Studied Calculus) If $A_1, A_2, \ldots, A_k, \ldots$ is an infinite sequence of $n \times n$ matrices, then the sequence is said to converge to the $n \times n$ matrix $A$ if the entries in the $i$th row and $j$th column of the sequence converge to the entry in the $i$th row and $j$th column of $A$ for all $i$ and $j$. In that case we call $A$ the limit of the sequence and write $\lim_{k \to +\infty} A_k = A$. The algebraic properties of such limits mirror those of numerical limits. Thus, for example, if $P$ is an invertible $n \times n$ matrix whose entries do not depend on $k$, then $\lim_{k \to +\infty} A_k = A$ if and only if

$$\lim_{k \to +\infty} P^{-1}A_kP = P^{-1}AP.$$

(a) Suppose that $A$ is an $n \times n$ diagonalizable matrix. Under what conditions on the eigenvalues of $A$ will the sequence $A, A^2, \ldots, A^k, \ldots$ converge? Explain your reasoning.



(b) What is the limit when your conditions are satisfied? 



28. (For Readers Who Have Studied Calculus) If $A_1 + A_2 + \cdots + A_k + \cdots$ is an infinite series of $n \times n$ matrices, then the series is said to converge if its sequence of partial sums converges to some limit $A$ in the sense defined in Exercise 27. In that case we call $A$ the sum of the series and write $A = A_1 + A_2 + \cdots + A_k + \cdots$.

(a) From calculus, under what conditions on $x$ does the geometric series

$$1 + x + x^2 + \cdots + x^k + \cdots$$

converge? What is the sum?

(b) Judging on the basis of Exercise 27, under what conditions on the eigenvalues of $A$ would you expect the geometric matrix series $I + A + A^2 + \cdots + A^k + \cdots$ to converge? Explain your reasoning.



(c) What is the sum of the series when it converges? 

29. Show that the Jordan block matrix $J_n$ has $\lambda = 1$ as its only eigenvalue and that the corresponding eigenspace is $\operatorname{span}\{\mathbf{e}_1\}$.

7.3

ORTHOGONAL
DIAGONALIZATION

In this section we shall be concerned with the problem of finding an orthonormal basis for $R^n$ with the Euclidean inner product consisting of eigenvectors of a given $n \times n$ matrix $A$. Our earlier work on symmetric matrices and orthogonal matrices will play an important role in the discussion that follows.



Orthogonal Diagonalization Problem 

As in the preceding section, we begin by stating two problems. Our goal is to show that the problems are equivalent. 

The Orthonormal Eigenvector Problem Given an $n \times n$ matrix $A$, does there exist an orthonormal basis for $R^n$ with the Euclidean inner product that consists of eigenvectors of the matrix $A$?

The Orthogonal Diagonalization Problem (Matrix Form) Given an $n \times n$ matrix $A$, does there exist an orthogonal matrix $P$ such that the matrix $P^{-1}AP = P^TAP$ is diagonal? If there is such a matrix, then $A$ is said to be orthogonally diagonalizable and $P$ is said to orthogonally diagonalize $A$.

For the latter problem, we have two questions to consider: 
Which matrices are orthogonally diagonalizable? 

How do we find an orthogonal matrix to carry out the diagonalization? 

With regard to the first question, we note that there is no hope of orthogonally diagonalizing a matrix $A$ unless $A$ is symmetric (that is, $A = A^T$). To see why this is so, suppose that

$$P^TAP = D \tag{1}$$

where $P$ is an orthogonal matrix and $D$ is a diagonal matrix. Since $P$ is orthogonal, $PP^T = P^TP = I$, so it follows that (1) can be written as

$$A = PDP^T \tag{2}$$

Since $D$ is a diagonal matrix, we have $D = D^T$. Therefore, transposing both sides of (2) yields

$$A^T = (PDP^T)^T = (P^T)^TD^TP^T = PDP^T = A$$

so $A$ must be symmetric.

Conditions for Orthogonal Diagonalizability 

The following theorem shows that every symmetric matrix is, in fact, orthogonally diagonalizable. In this theorem, and for the remainder of this section, orthogonal will mean orthogonal with respect to the Euclidean inner product on $R^n$.



THEOREM 7.3.1 



If A is an n x n matrix, then the following are equivalent. 

(a) A is orthogonally diagonalizable. 

(b) A has an orthonormal set of n eigenvectors.

(c) A is symmetric. 



Proof (a) $\Rightarrow$ (b) Since $A$ is orthogonally diagonalizable, there is an orthogonal matrix $P$ such that $P^{-1}AP$ is diagonal. As shown in the proof of Theorem 7.2.1, the $n$ column vectors of $P$ are eigenvectors of $A$. Since $P$ is orthogonal, these column vectors are orthonormal (see Theorem 6.6.1), so $A$ has $n$ orthonormal eigenvectors.

(b) $\Rightarrow$ (a) Assume that $A$ has an orthonormal set of $n$ eigenvectors $\{\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n\}$. As shown in the proof of Theorem 7.2.1, the matrix $P$ with these eigenvectors as columns diagonalizes $A$. Since these eigenvectors are orthonormal, $P$ is orthogonal and thus orthogonally diagonalizes $A$.

(a) $\Rightarrow$ (c) In the proof that (a) $\Rightarrow$ (b), we showed that an orthogonally diagonalizable $n \times n$ matrix $A$ is orthogonally diagonalized by an $n \times n$ matrix $P$ whose columns form an orthonormal set of eigenvectors of $A$. Let $D$ be the diagonal matrix

$$D = P^{-1}AP$$

Thus

$$A = PDP^{-1} \quad \text{or} \quad A = PDP^T$$

since $P$ is orthogonal. Therefore,

$$A^T = (PDP^T)^T = PD^TP^T = PDP^T = A$$

which shows that $A$ is symmetric.

(c) $\Rightarrow$ (a) The proof of this part is beyond the scope of this text and will be omitted.

Note in particular that every symmetric matrix is diagonalizable. 

Symmetric Matrices 

Our next goal is to devise a procedure for orthogonally diagonalizing a symmetric matrix, but before we can do so, we need 
a critical theorem about eigenvalues and eigenvectors of symmetric matrices. 

THEOREM 7.3.2 



If A is a symmetric matrix, then 



(a) The eigenvalues of A are all real numbers. 

(b) Eigenvectors from different eigenspaces are orthogonal. 



Proof (a) The proof of part (a), which requires results about complex vector spaces, is discussed in Section 10.6. 



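Proof (b) Let $\mathbf{v}_1$ and $\mathbf{v}_2$ be eigenvectors corresponding to distinct eigenvalues $\lambda_1$ and $\lambda_2$ of the matrix $A$. We want to show that $\mathbf{v}_1 \cdot \mathbf{v}_2 = 0$. The proof of this involves the trick of starting with the expression $A\mathbf{v}_1 \cdot \mathbf{v}_2$. It follows from Formula 8 of Section 4.1 and the symmetry of $A$ that

$$A\mathbf{v}_1 \cdot \mathbf{v}_2 = \mathbf{v}_1 \cdot A^T\mathbf{v}_2 = \mathbf{v}_1 \cdot A\mathbf{v}_2 \tag{3}$$

But $\mathbf{v}_1$ is an eigenvector of $A$ corresponding to $\lambda_1$, and $\mathbf{v}_2$ is an eigenvector of $A$ corresponding to $\lambda_2$, so (3) yields the relationship

$$\lambda_1\mathbf{v}_1 \cdot \mathbf{v}_2 = \mathbf{v}_1 \cdot \lambda_2\mathbf{v}_2$$

which can be rewritten as

$$(\lambda_1 - \lambda_2)(\mathbf{v}_1 \cdot \mathbf{v}_2) = 0 \tag{4}$$

But $\lambda_1 - \lambda_2 \neq 0$, since $\lambda_1$ and $\lambda_2$ were assumed distinct. Thus it follows from (4) that $\mathbf{v}_1 \cdot \mathbf{v}_2 = 0$.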



Remark We remind the reader that we have assumed to this point that all of our matrices have real entries. Indeed, we shall 
see in Chapter 10 that part (a) of Theorem 7.3.2 is false for matrices with complex entries. 

Diagonalization of Symmetric Matrices 

As a consequence of the preceding theorem we obtain the following procedure for orthogonally diagonalizing a symmetric 
matrix. 

Step 1. Find a basis for each eigenspace of A. 

Step 2. Apply the Gram-Schmidt process to each of these bases to obtain an orthonormal basis for each eigenspace. 

Step 3. Form the matrix P whose columns are the basis vectors constructed in Step 2; this matrix orthogonally 
diagonalizes A. 

The justification of this procedure should be clear: Theorem 7.3.2 ensures that eigenvectors from different eigenspaces are 
orthogonal, whereas the application of the Gram-Schmidt process ensures that the eigenvectors obtained within the same 
eigenspace are orthonormal. Therefore, the entire set of eigenvectors obtained by this procedure is orthonormal. 
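
Steps 2 and 3 of the procedure can be sketched in code. The example below is an illustration under stated assumptions: the eigenspace bases are supplied by hand (here those of the symmetric matrix with rows (4, 2, 2), (2, 4, 2), (2, 2, 4), whose eigenvalues are 2 and 8), each basis is assumed linearly independent, and the names `gram_schmidt` and `eigenspace_bases` are hypothetical.

```python
import math

def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalize a linearly independent list of vectors (Step 2)."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            coeff = dot(v, q)  # component of v along the unit vector q
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis

# Step 1 is assumed done: one basis per eigenspace, keyed by eigenvalue.
eigenspace_bases = {
    2: [[-1, 1, 0], [-1, 0, 1]],
    8: [[1, 1, 1]],
}

# Step 3: collect the orthonormal vectors; they are the columns of P.
columns = []
for eigenvalue, basis in eigenspace_bases.items():
    columns.extend(gram_schmidt(basis))
```

Gram-Schmidt is applied separately within each eigenspace; orthogonality between vectors from different eigenspaces is not enforced by the code but follows from Theorem 7.3.2.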



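EXAMPLE 1 An Orthogonal Matrix P That Diagonalizes a Matrix A

Find an orthogonal matrix $P$ that diagonalizes

$$A = \begin{bmatrix} 4 & 2 & 2 \\ 2 & 4 & 2 \\ 2 & 2 & 4 \end{bmatrix}$$

Solution

The characteristic equation of $A$ is

$$\det(\lambda I - A) = \det\begin{bmatrix} \lambda - 4 & -2 & -2 \\ -2 & \lambda - 4 & -2 \\ -2 & -2 & \lambda - 4 \end{bmatrix} = (\lambda - 2)^2(\lambda - 8) = 0$$

Thus the eigenvalues of $A$ are $\lambda = 2$ and $\lambda = 8$. By the method used in Example 5 of Section 7.1, it can be shown that

$$\mathbf{u}_1 = \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} \quad \text{and} \quad \mathbf{u}_2 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \tag{5}$$

form a basis for the eigenspace corresponding to $\lambda = 2$. Applying the Gram-Schmidt process to $\{\mathbf{u}_1, \mathbf{u}_2\}$ yields the following orthonormal eigenvectors (verify):

$$\mathbf{v}_1 = \begin{bmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \\ 0 \end{bmatrix} \quad \text{and} \quad \mathbf{v}_2 = \begin{bmatrix} -1/\sqrt{6} \\ -1/\sqrt{6} \\ 2/\sqrt{6} \end{bmatrix} \tag{6}$$

The eigenspace corresponding to $\lambda = 8$ has

$$\mathbf{u}_3 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$

as a basis. Applying the Gram-Schmidt process to $\{\mathbf{u}_3\}$ yields

$$\mathbf{v}_3 = \begin{bmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{bmatrix}$$

Finally, using $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ as column vectors, we obtain

$$P = \begin{bmatrix} -1/\sqrt{2} & -1/\sqrt{6} & 1/\sqrt{3} \\ 1/\sqrt{2} & -1/\sqrt{6} & 1/\sqrt{3} \\ 0 & 2/\sqrt{6} & 1/\sqrt{3} \end{bmatrix}$$

which orthogonally diagonalizes $A$. (As a check, the reader may wish to verify that $P^TAP$ is a diagonal matrix.)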
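
The check suggested at the end of Example 1 can be carried out numerically. This is a minimal sketch, assuming the matrix $A$ with rows (4, 2, 2), (2, 4, 2), (2, 2, 4) and the matrix $P$ whose columns are the orthonormal eigenvectors found in the example; the helper names are illustrative. It computes $P^TAP$ in floating point, so the off-diagonal entries are expected to vanish only up to rounding.

```python
import math

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]

A = [[4, 2, 2], [2, 4, 2], [2, 2, 4]]

s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
P = [[-1 / s2, -1 / s6, 1 / s3],
     [1 / s2, -1 / s6, 1 / s3],
     [0.0, 2 / s6, 1 / s3]]

# For an orthogonal P, the inverse is the transpose, so P^{-1} A P = P^T A P.
D = matmul(matmul(transpose(P), A), P)

# P^T P should be the identity if the columns of P are orthonormal.
PtP = matmul(transpose(P), P)
```

The diagonal of `D` lists the eigenvalues in the same order as the corresponding columns of `P`.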



Exercise Set 7.3 






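1. Find the characteristic equation of the given symmetric matrix, and then by inspection determine the dimensions of the eigenspaces.

(a) $\begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$

(b) $\begin{bmatrix} 1 & -4 & 2 \\ -4 & 1 & -2 \\ 2 & -2 & -2 \end{bmatrix}$

(c) $\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$

(d) $\begin{bmatrix} 4 & 2 & 2 \\ 2 & 4 & 2 \\ 2 & 2 & 4 \end{bmatrix}$

(e) $\begin{bmatrix} 4 & 4 & 0 & 0 \\ 4 & 4 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$

(f) $\begin{bmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & 0 & 0 \\ 0 & 0 & 2 & -1 \\ 0 & 0 & -1 & 2 \end{bmatrix}$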



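In Exercises 2-9 find a matrix $P$ that orthogonally diagonalizes $A$, and determine $P^{-1}AP$.

2. $A = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}$

3. $A = \begin{bmatrix} 6 & 2\sqrt{3} \\ 2\sqrt{3} & 7 \end{bmatrix}$

4. $A = \begin{bmatrix} 6 & -2 \\ -2 & 3 \end{bmatrix}$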


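5. $A = \begin{bmatrix} -2 & 0 & -36 \\ 0 & -3 & 0 \\ -36 & 0 & -23 \end{bmatrix}$

6. $A = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$

7. $A = \begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{bmatrix}$

8. $A = \begin{bmatrix} 3 & 1 & 0 & 0 \\ 1 & 3 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$

9. $A = \begin{bmatrix} -7 & 24 & 0 & 0 \\ 24 & 7 & 0 & 0 \\ 0 & 0 & -7 & 24 \\ 0 & 0 & 24 & 7 \end{bmatrix}$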



10. 



Assuming that £ ^ Q, find a matrix that orthogonally diagonalizes 



11. 



Prove that if $A$ is any $m \times n$ matrix, then $A^TA$ has an orthonormal set of $n$ eigenvectors.



12. 



(a) Show that if $\mathbf{v}$ is any $n \times 1$ matrix and $I$ is the $n \times n$ identity matrix, then $I - \mathbf{v}\mathbf{v}^T$ is orthogonally diagonalizable.

(b) Find a matrix $P$ that orthogonally diagonalizes $I - \mathbf{v}\mathbf{v}^T$ if



v = 



13. 



Use the result in Exercise 17 of Section 7.1 to prove Theorem 7.3.2a for $2 \times 2$ symmetric matrices.



Discussion Discovery

14. Indicate whether each statement is always true or sometimes false. Justify your answer by giving a logical argument or a counterexample.

(a) If $A$ is a square matrix, then $AA^T$ and $A^TA$ are orthogonally diagonalizable.

(b) If $\mathbf{v}_1$ and $\mathbf{v}_2$ are eigenvectors from distinct eigenspaces of a symmetric matrix, then $\|\mathbf{v}_1 + \mathbf{v}_2\|^2 = \|\mathbf{v}_1\|^2 + \|\mathbf{v}_2\|^2$.

(c) An orthogonal matrix is orthogonally diagonalizable.

(d) If $A$ is an invertible orthogonally diagonalizable matrix, then $A^{-1}$ is orthogonally diagonalizable.



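15. Does there exist a $3 \times 3$ symmetric matrix with eigenvalues $\lambda_1 = -1$, $\lambda_2 = 3$, $\lambda_3 = 7$ and corresponding eigenvectors

$$\begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}?$$

If so, find such a matrix; if not, explain why not.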



16. Is the converse of Theorem 7.3.2b true?

Supplementary Exercises



1.

(a) Show that if $0 < \theta < \pi$, then

$$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

has no eigenvalues and consequently no eigenvectors.

(b) Give a geometric explanation of the result in part (a).



2. Find the eigenvalues of

$$A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ k^3 & -3k^2 & 3k \end{bmatrix}$$

3.

(a) Show that if $D$ is a diagonal matrix with nonnegative entries on the main diagonal, then there is a matrix $S$ such that $S^2 = D$.

(b) Show that if $A$ is a diagonalizable matrix with nonnegative eigenvalues, then there is a matrix $S$ such that $S^2 = A$.

(c) Find a matrix $S$ such that $S^2 = A$ if

$$A = \begin{bmatrix} 1 & 3 & 1 \\ 0 & 4 & 5 \\ 0 & 0 & 9 \end{bmatrix}$$

4. Prove: If $A$ is a square matrix, then $A$ and $A^T$ have the same characteristic polynomial.

5. Prove: If $A$ is a square matrix and $p(\lambda) = \det(\lambda I - A)$ is the characteristic polynomial of $A$, then the coefficient of $\lambda^{n-1}$ in $p(\lambda)$ is the negative of the trace of $A$.



6. Prove: If $b \neq 0$, then

$$A = \begin{bmatrix} a & b \\ 0 & a \end{bmatrix}$$

is not diagonalizable.



7. In advanced linear algebra, one proves the Cayley-Hamilton Theorem, which states that a square matrix $A$ satisfies its characteristic equation; that is, if

$$c_0 + c_1\lambda + c_2\lambda^2 + \cdots + c_{n-1}\lambda^{n-1} + \lambda^n = 0$$

is the characteristic equation of $A$, then

$$c_0I + c_1A + c_2A^2 + \cdots + c_{n-1}A^{n-1} + A^n = 0$$

Verify this result for

(a) $A = \begin{bmatrix} 3 & 6 \\ 1 & 2 \end{bmatrix}$

(b) $A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & -3 & 3 \end{bmatrix}$



Exercises 8-10 use the Cayley-Hamilton Theorem, stated in Exercise 7.

8.

(a) Use Exercise 16 of Section 7.1 to prove the Cayley-Hamilton Theorem for arbitrary $2 \times 2$ matrices.

(b) Prove the Cayley-Hamilton Theorem for $n \times n$ diagonalizable matrices.



9. The Cayley-Hamilton Theorem provides a method for calculating powers of a matrix. For example, if $A$ is a $2 \times 2$ matrix with characteristic equation

$$c_0 + c_1\lambda + \lambda^2 = 0$$

then $c_0I + c_1A + A^2 = 0$, so

$$A^2 = -c_1A - c_0I$$

Multiplying through by $A$ yields $A^3 = -c_1A^2 - c_0A$, which expresses $A^3$ in terms of $A^2$ and $A$, and multiplying through by $A^2$ yields $A^4 = -c_1A^3 - c_0A^2$, which expresses $A^4$ in terms of $A^3$ and $A^2$. Continuing in this way, we can calculate successive powers of $A$ simply by expressing them in terms of lower powers. Use this procedure to calculate $A^2$, $A^3$, $A^4$, and $A^5$ for

$$A = \begin{bmatrix} 3 & 6 \\ 1 & 2 \end{bmatrix}$$



10. Use the method of the preceding exercise to calculate $A^3$ and $A^4$ for

$$A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & -3 & 3 \end{bmatrix}$$



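11. Find the eigenvalues of the matrix

$$A = \begin{bmatrix} c_1 & c_2 & \cdots & c_n \\ c_1 & c_2 & \cdots & c_n \\ \vdots & \vdots & & \vdots \\ c_1 & c_2 & \cdots & c_n \end{bmatrix}$$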



12.

(a) It was shown in Exercise 15 of Section 7.1 that if $A$ is an $n \times n$ matrix, then the coefficient of $\lambda^n$ in the characteristic polynomial of $A$ is 1. (A polynomial with this property is called monic.) Show that the matrix

$$\begin{bmatrix} 0 & 0 & 0 & \cdots & 0 & -c_0 \\ 1 & 0 & 0 & \cdots & 0 & -c_1 \\ 0 & 1 & 0 & \cdots & 0 & -c_2 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & -c_{n-1} \end{bmatrix}$$

has characteristic polynomial $p(\lambda) = c_0 + c_1\lambda + \cdots + c_{n-1}\lambda^{n-1} + \lambda^n$. This shows that every monic polynomial is the characteristic polynomial of some matrix. The matrix in this example is called the companion matrix of $p(\lambda)$.

Hint Evaluate all determinants in the problem by adding a multiple of the second row to the first to introduce a zero at the top of the first column, and then expanding by cofactors along the first column.

(b) Find a matrix with characteristic polynomial $p(\lambda) = 1 - 2\lambda + \lambda^2 + 3\lambda^3 + \lambda^4$.



13. A square matrix $A$ is called nilpotent if $A^n = 0$ for some positive integer $n$. What can you say about the eigenvalues of a nilpotent matrix?



14. 



Prove: If A is an n x n matrix and n is odd, then A has at least one real eigenvalue. 



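15. Find a $3 \times 3$ matrix $A$ that has eigenvalues $\lambda = 0$, $1$, and $-1$ with corresponding eigenvectors

$$\begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$$

respectively.

16. Suppose that a $4 \times 4$ matrix $A$ has eigenvalues $\lambda_1 = 1$, $\lambda_2 = -2$, $\lambda_3 = 3$, and $\lambda_4 = -3$.

(a) Use the method of Exercise 14 of Section 7.1 to find $\det(A)$.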



(b) Use Exercise 5 above to find $\operatorname{tr}(A)$.



17. Let $A$ be a square matrix such that $A^3 = A$. What can you say about the eigenvalues of $A$?

Technology Exercises



The following exercises are designed to be solved using a technology utility. Typically, this will be matlab, Mathematica, Maple, 
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 

Section 7.1 

T1. (Characteristic Polynomial) Some technology utilities have a specific command for finding characteristic polynomials, and in others you must use the determinant function to compute $\det(\lambda I - A)$. Read your documentation to determine which method you must use, and then use your utility to find $p(\lambda)$ for the matrix in Example 2.



T2. (Solving the Characteristic Equation) Depending on the particular characteristic polynomial, your technology utility may 
or may not be successful in solving the characteristic equation for the eigenvalues. See if your utility can find the eigenvalues 
in Example 2 by solving the characteristic equation $p(\lambda) = 0$.



T3. 



(a) Read the statement of the Cayley-Hamilton Theorem in Supplementary Exercise 7 of this chapter, and then use your 
technology utility to do that exercise. 

(b) If you are working with a CAS, use it to prove the Cayley-Hamilton Theorem for 3 x 3 matrices. 



T4. (Eigenvalues) Some technology utilities have specific commands for finding the eigenvalues of a matrix directly (though the 
procedure may not be successful in all cases). If your utility has this capability, read the documentation and then compute the 
eigenvalues in Example 2 directly. 



T5. (Eigenvectors) One way to use a technology utility to find eigenvectors corresponding to an eigenvalue $\lambda$ is to solve the linear system $(\lambda I - A)\mathbf{x} = \mathbf{0}$. Another way is to use a command for finding a basis for the nullspace of $\lambda I - A$ (if available).
However, some utilities have specific commands for finding eigenvectors. Read your documentation, and then explore 
various procedures for finding the eigenvectors in Examples 5 and 6. 

Section 7.2 

T1. (Diagonalization) Some technology utilities have specific commands for diagonalizing a matrix. If your utility has this
capability, read the documentation and then use your utility to perform the computations in Example 2. 

Note Your software may or may not produce the eigenvalues of A and the columns of P in the same order as the example. 



Section 7.3 

T1. (Orthogonal Diagonalization) Use your technology utility to check the computations in Example 1.

CHAPTER 8



Linear Transformations



INTRODUCTION: In Sections 4.2 and 4.3 we studied linear transformations from $R^n$ to $R^m$. In this chapter we shall define and study linear transformations from an arbitrary vector space $V$ to another arbitrary vector space $W$. The results we obtain here have important applications in physics, engineering, and various branches of mathematics.

8.1

GENERAL LINEAR 
TRANSFORMATIONS 



In Section 4.2 we defined linear transformations from $R^n$ to $R^m$. In this section we shall extend this idea by defining the more general concept of a linear transformation from one vector space to another.



Definitions and Terminology 

Recall that a linear transformation from $R^n$ to $R^m$ was first defined as a function

$$T(x_1, x_2, \ldots, x_n) = (w_1, w_2, \ldots, w_m)$$

for which the equations relating $w_1, w_2, \ldots, w_m$ and $x_1, x_2, \ldots, x_n$ are linear. Subsequently, we showed that a transformation $T: R^n \to R^m$ is linear if and only if the two relationships

$$T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v}) \quad \text{and} \quad T(c\mathbf{u}) = cT(\mathbf{u})$$

hold for all vectors $\mathbf{u}$ and $\mathbf{v}$ in $R^n$ and every scalar $c$ (see Theorem 4.3.2). We shall use these properties as the starting point for general linear transformations.



DEFINITION 



If $T: V \to W$ is a function from a vector space $V$ into a vector space $W$, then $T$ is called a linear transformation from $V$ to $W$ if, for all vectors $\mathbf{u}$ and $\mathbf{v}$ in $V$ and all scalars $c$,

(a) $T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})$

(b) $T(c\mathbf{u}) = cT(\mathbf{u})$

In the special case where $V = W$, the linear transformation $T: V \to V$ is called a linear operator on $V$.



EXAMPLE 1 Matrix Transformations 



Because the preceding definition of a linear transformation was based on Theorem 4.3.2, linear transformations from $R^n$ to $R^m$, as defined in Section 4.2, are linear transformations under this more general definition as well. We shall call linear transformations from $R^n$ to $R^m$ matrix transformations, since they can be carried out by matrix multiplication.



EXAMPLE 2 The Zero Transformation 



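Let $V$ and $W$ be any two vector spaces. The mapping $T: V \to W$ such that $T(\mathbf{v}) = \mathbf{0}$ for every $\mathbf{v}$ in $V$ is a linear transformation called the zero transformation. To see that $T$ is linear, observe that

$$T(\mathbf{u} + \mathbf{v}) = \mathbf{0}, \quad T(\mathbf{u}) = \mathbf{0}, \quad T(\mathbf{v}) = \mathbf{0}, \quad \text{and} \quad T(k\mathbf{u}) = \mathbf{0}$$

Therefore,

$$T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v}) \quad \text{and} \quad T(k\mathbf{u}) = kT(\mathbf{u})$$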



EXAMPLE 3 The Identity Operator 



Let $V$ be any vector space. The mapping $I: V \to V$ defined by $I(\mathbf{v}) = \mathbf{v}$ is called the identity operator on $V$. The verification that $I$ is linear is left for the reader.



EXAMPLE 4 Dilation and Contraction Operators 



Let $V$ be any vector space and $k$ any fixed scalar. We leave it as an exercise to check that the function $T: V \to V$ defined by

$$T(\mathbf{v}) = k\mathbf{v}$$

is a linear operator on $V$. This linear operator is called a dilation of $V$ with factor $k$ if $k > 1$ and is called a contraction of $V$ with factor $k$ if $0 < k < 1$. Geometrically, the dilation "stretches" each vector in $V$ by a factor of $k$, and the contraction of $V$ "compresses" each vector by a factor of $k$ (Figure 8.1.1).

Figure 8.1.1 (a) Dilation of V. (b) Contraction of V.



EXAMPLE 5 Orthogonal Projections 



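In Section 6.4 we defined the orthogonal projection of $R^m$ onto a subspace $W$. [See Formula 6 and the definition preceding it in that section.] Orthogonal projections can also be defined in general inner product spaces as follows: Suppose that $W$ is a finite-dimensional subspace of an inner product space $V$; then the orthogonal projection of $V$ onto $W$ is the transformation defined by

$$T(\mathbf{v}) = \operatorname{proj}_W\mathbf{v}$$

(Figure 8.1.2). It follows from Theorem 6.3.5 that if

$$S = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_r\}$$

is any orthonormal basis for $W$, then $T(\mathbf{v})$ is given by the formula

$$T(\mathbf{v}) = \operatorname{proj}_W\mathbf{v} = \langle \mathbf{v}, \mathbf{w}_1 \rangle\mathbf{w}_1 + \langle \mathbf{v}, \mathbf{w}_2 \rangle\mathbf{w}_2 + \cdots + \langle \mathbf{v}, \mathbf{w}_r \rangle\mathbf{w}_r$$

The proof that $T$ is a linear transformation follows from properties of the inner product. For example,

$$\begin{aligned} T(\mathbf{u} + \mathbf{v}) &= \langle \mathbf{u} + \mathbf{v}, \mathbf{w}_1 \rangle\mathbf{w}_1 + \langle \mathbf{u} + \mathbf{v}, \mathbf{w}_2 \rangle\mathbf{w}_2 + \cdots + \langle \mathbf{u} + \mathbf{v}, \mathbf{w}_r \rangle\mathbf{w}_r \\ &= \langle \mathbf{u}, \mathbf{w}_1 \rangle\mathbf{w}_1 + \langle \mathbf{u}, \mathbf{w}_2 \rangle\mathbf{w}_2 + \cdots + \langle \mathbf{u}, \mathbf{w}_r \rangle\mathbf{w}_r \\ &\quad + \langle \mathbf{v}, \mathbf{w}_1 \rangle\mathbf{w}_1 + \langle \mathbf{v}, \mathbf{w}_2 \rangle\mathbf{w}_2 + \cdots + \langle \mathbf{v}, \mathbf{w}_r \rangle\mathbf{w}_r \\ &= T(\mathbf{u}) + T(\mathbf{v}) \end{aligned}$$

Similarly, $T(k\mathbf{u}) = kT(\mathbf{u})$.

Figure 8.1.2 The orthogonal projection of V onto W.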



EXAMPLE 6 Computing an Orthogonal Projection 



As a special case of the preceding example, let $V = R^3$ have the Euclidean inner product. The vectors $\mathbf{w}_1 = (1, 0, 0)$ and $\mathbf{w}_2 = (0, 1, 0)$ form an orthonormal basis for the $xy$-plane. Thus, if $\mathbf{v} = (x, y, z)$ is any vector in $R^3$, the orthogonal projection of $R^3$ onto the $xy$-plane is given by

$$\begin{aligned}
T(\mathbf{v}) &= \langle \mathbf{v}, \mathbf{w}_1 \rangle \mathbf{w}_1 + \langle \mathbf{v}, \mathbf{w}_2 \rangle \mathbf{w}_2 \\
&= x(1, 0, 0) + y(0, 1, 0) \\
&= (x, y, 0)
\end{aligned}$$

(See Figure 8.1.3.)

Figure 8.1.3 The orthogonal projection of $R^3$ onto the $xy$-plane.
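The projection formula of Examples 5 and 6 can be tried out numerically. The following is a minimal sketch, assuming the Euclidean inner product on tuples of numbers; the helper names `inner` and `project` are hypothetical and chosen only for this illustration.

```python
def inner(u, v):
    """Euclidean inner product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))


def project(v, orthonormal_basis):
    """Return <v,w1>w1 + ... + <v,wr>wr for an orthonormal basis w1, ..., wr of W."""
    result = [0] * len(v)
    for w in orthonormal_basis:
        coeff = inner(v, w)  # the coefficient <v, w>
        result = [r + coeff * wi for r, wi in zip(result, w)]
    return tuple(result)


# Orthonormal basis for the xy-plane, as in Example 6.
w1, w2 = (1, 0, 0), (0, 1, 0)
print(project((3, -2, 7), [w1, w2]))  # (3, -2, 0)
```

The result agrees with the formula $T(x, y, z) = (x, y, 0)$. The function is only valid when the basis passed in really is orthonormal; it does not check this.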



EXAMPLE 7 A Linear Transformation from a Space $V$ to $R^n$



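Let $S = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$ be a basis for an $n$-dimensional vector space $V$, and let

$$(\mathbf{v})_S = (k_1, k_2, \ldots, k_n)$$

be the coordinate vector relative to $S$ of a vector $\mathbf{v}$ in $V$; thus

$$\mathbf{v} = k_1\mathbf{w}_1 + k_2\mathbf{w}_2 + \cdots + k_n\mathbf{w}_n$$

Define $T: V \to R^n$ to be the function that maps $\mathbf{v}$ into its coordinate vector relative to $S$; that is,

$$T(\mathbf{v}) = (\mathbf{v})_S = (k_1, k_2, \ldots, k_n)$$

The function $T$ is a linear transformation. To see that this is so, suppose that $\mathbf{u}$ and $\mathbf{v}$ are vectors in $V$ and that

$$\mathbf{u} = c_1\mathbf{w}_1 + c_2\mathbf{w}_2 + \cdots + c_n\mathbf{w}_n \quad \text{and} \quad \mathbf{v} = d_1\mathbf{w}_1 + d_2\mathbf{w}_2 + \cdots + d_n\mathbf{w}_n$$

Thus

$$(\mathbf{u})_S = (c_1, c_2, \ldots, c_n) \quad \text{and} \quad (\mathbf{v})_S = (d_1, d_2, \ldots, d_n)$$

But

$$\begin{aligned}
\mathbf{u} + \mathbf{v} &= (c_1 + d_1)\mathbf{w}_1 + (c_2 + d_2)\mathbf{w}_2 + \cdots + (c_n + d_n)\mathbf{w}_n \\
k\mathbf{u} &= (kc_1)\mathbf{w}_1 + (kc_2)\mathbf{w}_2 + \cdots + (kc_n)\mathbf{w}_n
\end{aligned}$$

so

$$\begin{aligned}
(\mathbf{u} + \mathbf{v})_S &= (c_1 + d_1, c_2 + d_2, \ldots, c_n + d_n) \\
(k\mathbf{u})_S &= (kc_1, kc_2, \ldots, kc_n)
\end{aligned}$$

Therefore,

$$(\mathbf{u} + \mathbf{v})_S = (\mathbf{u})_S + (\mathbf{v})_S \quad \text{and} \quad (k\mathbf{u})_S = k(\mathbf{u})_S$$

Expressing these equations in terms of $T$, we obtain

$$T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v}) \quad \text{and} \quad T(k\mathbf{u}) = kT(\mathbf{u})$$

which shows that $T$ is a linear transformation.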



Remark The computations in the preceding example could just as well have been performed using coordinate vectors in column form; that is,

$$[\mathbf{u} + \mathbf{v}]_S = [\mathbf{u}]_S + [\mathbf{v}]_S \quad \text{and} \quad [k\mathbf{u}]_S = k[\mathbf{u}]_S$$



EXAMPLE 8 A Linear Transformation from $P_n$ to $P_{n+1}$



Let $\mathbf{p} = p(x) = c_0 + c_1x + \cdots + c_nx^n$ be a polynomial in $P_n$, and define the function $T: P_n \to P_{n+1}$ by

$$T(\mathbf{p}) = T(p(x)) = xp(x) = c_0x + c_1x^2 + \cdots + c_nx^{n+1}$$

The function $T$ is a linear transformation, since for any scalar $k$ and any polynomials $\mathbf{p}_1$ and $\mathbf{p}_2$ in $P_n$ we have

$$\begin{aligned}
T(\mathbf{p}_1 + \mathbf{p}_2) &= T(p_1(x) + p_2(x)) = x(p_1(x) + p_2(x)) \\
&= xp_1(x) + xp_2(x) = T(\mathbf{p}_1) + T(\mathbf{p}_2)
\end{aligned}$$

and

$$T(k\mathbf{p}) = T(kp(x)) = x(kp(x)) = k(xp(x)) = kT(\mathbf{p})$$

(Compare this to Exercise 4 of Section 4.4.)



EXAMPLE 9 A Linear Operator on $P_n$



Let $\mathbf{p} = p(x) = c_0 + c_1x + \cdots + c_nx^n$ be a polynomial in $P_n$, and let $a$ and $b$ be any scalars. We leave it as an exercise to show that the function $T$ defined by

$$T(\mathbf{p}) = T(p(x)) = p(ax + b) = c_0 + c_1(ax + b) + \cdots + c_n(ax + b)^n$$

is a linear operator. For example, if $ax + b = 3x - 5$, then $T: P_2 \to P_2$ would be the linear operator given by the formula

$$T(c_0 + c_1x + c_2x^2) = c_0 + c_1(3x - 5) + c_2(3x - 5)^2$$
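The substitution operator of Example 9 is easy to experiment with once a polynomial is stored as its list of coefficients. The sketch below assumes the representation `[c0, c1, ..., cn]` for a nonzero-length coefficient list; the names `poly_mul` and `substitute` are hypothetical helpers introduced only for this illustration, and Horner's scheme is our choice of method, not something prescribed by the text.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def substitute(p, a, b):
    """Return the coefficients of p(ax + b), computed with Horner's scheme."""
    result = [p[-1]]  # start from the leading coefficient
    for c in reversed(p[:-1]):
        result = poly_mul(result, [b, a])  # multiply by (ax + b)
        result[0] += c                     # then add the next coefficient
    return result


# p(x) = 1 + 2x + x^2 = (1 + x)^2, so p(3x - 5) = (3x - 4)^2 = 16 - 24x + 9x^2.
print(substitute([1, 2, 1], 3, -5))  # [16, -24, 9]
```

Checking additivity on a pair of sample polynomials does not prove linearity, but it is a quick sanity test before attempting the exercise.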



EXAMPLE 10 A Linear Transformation Using an Inner Product 



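Let $V$ be an inner product space, and let $\mathbf{v}_0$ be any fixed vector in $V$. Let $T: V \to R$ be the transformation that maps a vector $\mathbf{v}$ into its inner product with $\mathbf{v}_0$; that is,

$$T(\mathbf{v}) = \langle \mathbf{v}, \mathbf{v}_0 \rangle$$

From the properties of an inner product,

$$T(\mathbf{u} + \mathbf{v}) = \langle \mathbf{u} + \mathbf{v}, \mathbf{v}_0 \rangle = \langle \mathbf{u}, \mathbf{v}_0 \rangle + \langle \mathbf{v}, \mathbf{v}_0 \rangle = T(\mathbf{u}) + T(\mathbf{v})$$

and

$$T(k\mathbf{u}) = \langle k\mathbf{u}, \mathbf{v}_0 \rangle = k\langle \mathbf{u}, \mathbf{v}_0 \rangle = kT(\mathbf{u})$$

so $T$ is a linear transformation.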



EXAMPLE 11 A Linear Transformation from $C^1(-\infty, \infty)$ to $F(-\infty, \infty)$



Calculus Required 



Let $V = C^1(-\infty, \infty)$ be the vector space of functions with continuous first derivatives on $(-\infty, \infty)$, and let $W = F(-\infty, \infty)$ be the vector space of all real-valued functions defined on $(-\infty, \infty)$. Let $D: V \to W$ be the transformation that maps a function $\mathbf{f} = f(x)$ into its derivative; that is,

$$D(\mathbf{f}) = f'(x)$$

From the properties of differentiation, we have

$$D(\mathbf{f} + \mathbf{g}) = D(\mathbf{f}) + D(\mathbf{g}) \quad \text{and} \quad D(k\mathbf{f}) = kD(\mathbf{f})$$

Thus, $D$ is a linear transformation.



EXAMPLE 12 A Linear Transformation from $C(-\infty, \infty)$ to $C^1(-\infty, \infty)$



Calculus Required 



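Let $V = C(-\infty, \infty)$ be the vector space of continuous functions on $(-\infty, \infty)$, and let $W = C^1(-\infty, \infty)$ be the vector space of functions with continuous first derivatives on $(-\infty, \infty)$. Let $J: V \to W$ be the transformation that maps $\mathbf{f} = f(x)$ into the integral $\int_0^x f(t)\,dt$. For example, if $\mathbf{f} = x^2$, then

$$J(\mathbf{f}) = \int_0^x t^2\,dt = \left.\frac{t^3}{3}\right|_0^x = \frac{x^3}{3}$$

From the properties of integration, we have

$$\begin{aligned}
J(\mathbf{f} + \mathbf{g}) &= \int_0^x (f(t) + g(t))\,dt = \int_0^x f(t)\,dt + \int_0^x g(t)\,dt = J(\mathbf{f}) + J(\mathbf{g}) \\
J(c\mathbf{f}) &= \int_0^x cf(t)\,dt = c\int_0^x f(t)\,dt = cJ(\mathbf{f})
\end{aligned}$$

so $J$ is a linear transformation.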
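For polynomials, the integration transformation of Example 12 acts on coefficients in a simple way, which makes the two linearity properties easy to check by machine. The sketch below restricts $J$ to polynomials stored as coefficient lists `[c0, c1, ..., cn]`; that restriction, and the use of exact fractions, are assumptions made for this illustration only.

```python
from fractions import Fraction


def J(p):
    """Coefficients of the integral from 0 to x of p(t) dt, for p = [c0, c1, ..., cn]."""
    # The term c_k t^k integrates to c_k x^(k+1) / (k+1); the constant term is 0.
    return [Fraction(0)] + [Fraction(c, k + 1) for k, c in enumerate(p)]


# f(x) = x^2 has coefficients [0, 0, 1]; J(f) should be x^3 / 3.
print(J([0, 0, 1]))

# Additivity and homogeneity on sample polynomials f and g.
f, g = [1, 2, 3], [4, 0, -1]
f_plus_g = [a + b for a, b in zip(f, g)]
print(J(f_plus_g) == [a + b for a, b in zip(J(f), J(g))])  # True
print(J([6 * c for c in f]) == [6 * c for c in J(f)])       # True
```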



EXAMPLE 13 A Transformation That Is Not Linear 



Let $T: M_{nn} \to R$ be the transformation that maps an $n \times n$ matrix into its determinant:

$$T(A) = \det(A)$$

If $n > 1$, then this transformation does not satisfy either of the properties required of a linear transformation. For example, we saw in Example 1 of Section 2.3 that

$$\det(A_1 + A_2) \neq \det(A_1) + \det(A_2)$$

in general. Moreover, $\det(cA) = c^n\det(A)$, so

$$\det(cA) \neq c\det(A)$$

in general. Thus $T$ is not a linear transformation.
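Both failures are easy to see on a concrete pair of $2 \times 2$ matrices. The following sketch uses sample matrices chosen here for illustration; `det2` is a hypothetical helper for the $2 \times 2$ case only.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c


A1 = ((1, 2), (2, 5))
A2 = ((3, 1), (1, 3))
A_sum = tuple(tuple(x + y for x, y in zip(r1, r2)) for r1, r2 in zip(A1, A2))
A_scaled = tuple(tuple(3 * x for x in row) for row in A1)

print(det2(A1), det2(A2), det2(A_sum))  # 1 8 23, so det(A1 + A2) != det(A1) + det(A2)
print(det2(A_scaled), 3 * det2(A1))     # 9 3, so det(3 A1) = 3^2 det(A1), not 3 det(A1)
```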

Properties of Linear Transformations 

If $T: V \to W$ is a linear transformation, then for any vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ in $V$ and any scalars $c_1$ and $c_2$, we have

$$T(c_1\mathbf{v}_1 + c_2\mathbf{v}_2) = T(c_1\mathbf{v}_1) + T(c_2\mathbf{v}_2) = c_1T(\mathbf{v}_1) + c_2T(\mathbf{v}_2)$$

and, more generally, if $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ are vectors in $V$ and $c_1, c_2, \ldots, c_n$ are scalars, then

$$T(c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n) = c_1T(\mathbf{v}_1) + c_2T(\mathbf{v}_2) + \cdots + c_nT(\mathbf{v}_n) \tag{1}$$

Formula 1 is sometimes described by saying that linear transformations preserve linear combinations.
The following theorem lists three basic properties that are common to all linear transformations.

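THEOREM 8.1.1

If $T: V \to W$ is a linear transformation, then

(a) $T(\mathbf{0}) = \mathbf{0}$
(b) $T(-\mathbf{v}) = -T(\mathbf{v})$ for all $\mathbf{v}$ in $V$
(c) $T(\mathbf{v} - \mathbf{w}) = T(\mathbf{v}) - T(\mathbf{w})$ for all $\mathbf{v}$ and $\mathbf{w}$ in $V$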









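Proof Let $\mathbf{v}$ be any vector in $V$. Since $0\mathbf{v} = \mathbf{0}$, we have

$$T(\mathbf{0}) = T(0\mathbf{v}) = 0T(\mathbf{v}) = \mathbf{0}$$

which proves (a). Also,

$$T(-\mathbf{v}) = T((-1)\mathbf{v}) = (-1)T(\mathbf{v}) = -T(\mathbf{v})$$

which proves (b). Finally, $\mathbf{v} - \mathbf{w} = \mathbf{v} + (-1)\mathbf{w}$; thus

$$\begin{aligned}
T(\mathbf{v} - \mathbf{w}) &= T(\mathbf{v} + (-1)\mathbf{w}) \\
&= T(\mathbf{v}) + (-1)T(\mathbf{w}) \\
&= T(\mathbf{v}) - T(\mathbf{w})
\end{aligned}$$

which proves (c).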

In words, part (a) of the preceding theorem states that a linear transformation maps $\mathbf{0}$ to $\mathbf{0}$. This property is useful for identifying transformations that are not linear. For example, if $\mathbf{x}_0$ is a fixed nonzero vector in $R^2$, then the transformation

$$T(\mathbf{x}) = \mathbf{x} + \mathbf{x}_0$$

has the geometric effect of translating each point $\mathbf{x}$ in a direction parallel to $\mathbf{x}_0$ through a distance of $\|\mathbf{x}_0\|$ (Figure 8.1.4). This cannot be a linear transformation, since $T(\mathbf{0}) = \mathbf{x}_0$, so $T$ does not map $\mathbf{0}$ to $\mathbf{0}$.




Figure 8.1.4 $T(\mathbf{x}) = \mathbf{x} + \mathbf{x}_0$ translates each point $\mathbf{x}$ along a line parallel to $\mathbf{x}_0$ through a distance $\|\mathbf{x}_0\|$.



Finding Linear Transformations from Images of Basis Vectors 

Theorem 4.3.3 shows that if $T$ is a matrix transformation, then the standard matrix for $T$ can be obtained from the images of the standard basis vectors. Stated another way, a matrix transformation is completely determined by its images of the standard basis vectors. This is a special case of a more general result: If $T: V \to W$ is a linear transformation, and if $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is any basis for $V$, then the image $T(\mathbf{v})$ of any vector $\mathbf{v}$ in $V$ can be calculated from the images

$$T(\mathbf{v}_1), T(\mathbf{v}_2), \ldots, T(\mathbf{v}_n)$$

of the basis vectors. This can be done by first expressing $\mathbf{v}$ as a linear combination of the basis vectors, say

$$\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$$

and then using Formula 1 to write

$$T(\mathbf{v}) = c_1T(\mathbf{v}_1) + c_2T(\mathbf{v}_2) + \cdots + c_nT(\mathbf{v}_n)$$

In words, a linear transformation is completely determined by the images of any set of basis vectors.



EXAMPLE 14 Computing with Images of Basis Vectors 

Consider the basis $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ for $R^3$, where $\mathbf{v}_1 = (1, 1, 1)$, $\mathbf{v}_2 = (1, 1, 0)$, and $\mathbf{v}_3 = (1, 0, 0)$. Let $T: R^3 \to R^2$ be the linear transformation such that

$$T(\mathbf{v}_1) = (1, 0), \quad T(\mathbf{v}_2) = (2, -1), \quad T(\mathbf{v}_3) = (4, 3)$$

Find a formula for $T(x_1, x_2, x_3)$; then use this formula to compute $T(2, -3, 5)$.



Solution 

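We first express $\mathbf{x} = (x_1, x_2, x_3)$ as a linear combination of $\mathbf{v}_1 = (1, 1, 1)$, $\mathbf{v}_2 = (1, 1, 0)$, and $\mathbf{v}_3 = (1, 0, 0)$. If we write

$$(x_1, x_2, x_3) = c_1(1, 1, 1) + c_2(1, 1, 0) + c_3(1, 0, 0)$$

then on equating corresponding components, we obtain

$$\begin{aligned}
c_1 + c_2 + c_3 &= x_1 \\
c_1 + c_2 &= x_2 \\
c_1 &= x_3
\end{aligned}$$

which yields $c_1 = x_3$, $c_2 = x_2 - x_3$, $c_3 = x_1 - x_2$, so

$$\begin{aligned}
(x_1, x_2, x_3) &= x_3(1, 1, 1) + (x_2 - x_3)(1, 1, 0) + (x_1 - x_2)(1, 0, 0) \\
&= x_3\mathbf{v}_1 + (x_2 - x_3)\mathbf{v}_2 + (x_1 - x_2)\mathbf{v}_3
\end{aligned}$$

Thus

$$\begin{aligned}
T(x_1, x_2, x_3) &= x_3T(\mathbf{v}_1) + (x_2 - x_3)T(\mathbf{v}_2) + (x_1 - x_2)T(\mathbf{v}_3) \\
&= x_3(1, 0) + (x_2 - x_3)(2, -1) + (x_1 - x_2)(4, 3) \\
&= (4x_1 - 2x_2 - x_3, \; 3x_1 - 4x_2 + x_3)
\end{aligned}$$

From this formula, we obtain

$$T(2, -3, 5) = (9, 23)$$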
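The two steps of Example 14 (find the coordinates of $\mathbf{x}$ relative to the basis, then combine the images of the basis vectors with those coordinates) can be sketched in code. The helper names `solve` and `apply_from_basis` are hypothetical; the sketch assumes the given vectors really form a basis, and it uses Gauss-Jordan elimination with exact fractions as one possible way to find the coordinates.

```python
from fractions import Fraction


def solve(basis, x):
    """Solve c1*basis[0] + ... + cn*basis[n-1] = x; assumes basis is a basis of R^n."""
    n = len(x)
    # Row i of the augmented matrix holds the i-th components of the basis vectors, then x[i].
    m = [[Fraction(v[i]) for v in basis] + [Fraction(x[i])] for i in range(n)]
    for k in range(n):
        pivot = next(r for r in range(k, n) if m[r][k] != 0)
        m[k], m[pivot] = m[pivot], m[k]
        lead = m[k][k]
        m[k] = [e / lead for e in m[k]]
        for r in range(n):
            if r != k:
                factor = m[r][k]
                m[r] = [a - factor * b for a, b in zip(m[r], m[k])]
    return [row[-1] for row in m]


def apply_from_basis(basis, images, x):
    """Compute T(x) = c1*T(v1) + ... + cn*T(vn) from the images of the basis vectors."""
    coeffs = solve(basis, x)
    return tuple(sum(c * img[j] for c, img in zip(coeffs, images))
                 for j in range(len(images[0])))


basis = [(1, 1, 1), (1, 1, 0), (1, 0, 0)]
images = [(1, 0), (2, -1), (4, 3)]
print(apply_from_basis(basis, images, (2, -3, 5)) == (9, 23))  # True
```

The coordinates found for $(2, -3, 5)$ are $c_1 = 5$, $c_2 = -8$, $c_3 = 5$, matching $c_1 = x_3$, $c_2 = x_2 - x_3$, $c_3 = x_1 - x_2$.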



In Section 4.2 we defined the composition of matrix transformations. The following definition extends that concept to 
general linear transformations. 



DEFINITION 



If $T_1: U \to V$ and $T_2: V \to W$ are linear transformations, then the composition of $T_2$ with $T_1$, denoted by $T_2 \circ T_1$ (which is read "$T_2$ circle $T_1$"), is the function defined by the formula

$$(T_2 \circ T_1)(\mathbf{u}) = T_2(T_1(\mathbf{u})) \tag{2}$$

where $\mathbf{u}$ is a vector in $U$.



Remark Observe that this definition requires that the domain of $T_2$ (which is $V$) contain the range of $T_1$; this is essential for the formula $T_2(T_1(\mathbf{u}))$ to make sense (Figure 8.1.5). The reader should compare Formula 2 to Formula 18 in Section 4.2.



Figure 8.1.5 The composition of $T_2$ with $T_1$.



The next result shows that the composition of two linear transformations is itself a linear transformation. 



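THEOREM 8.1.2

If $T_1: U \to V$ and $T_2: V \to W$ are linear transformations, then $(T_2 \circ T_1): U \to W$ is also a linear transformation.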


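Proof If $\mathbf{u}$ and $\mathbf{v}$ are vectors in $U$ and $c$ is a scalar, then it follows from Formula 2 and the linearity of $T_1$ and $T_2$ that

$$\begin{aligned}
(T_2 \circ T_1)(\mathbf{u} + \mathbf{v}) &= T_2(T_1(\mathbf{u} + \mathbf{v})) = T_2(T_1(\mathbf{u}) + T_1(\mathbf{v})) \\
&= T_2(T_1(\mathbf{u})) + T_2(T_1(\mathbf{v})) \\
&= (T_2 \circ T_1)(\mathbf{u}) + (T_2 \circ T_1)(\mathbf{v})
\end{aligned}$$

and

$$\begin{aligned}
(T_2 \circ T_1)(c\mathbf{u}) &= T_2(T_1(c\mathbf{u})) = T_2(cT_1(\mathbf{u})) \\
&= cT_2(T_1(\mathbf{u})) = c(T_2 \circ T_1)(\mathbf{u})
\end{aligned}$$

Thus $T_2 \circ T_1$ satisfies the two requirements of a linear transformation.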



EXAMPLE 15 Composition of Linear Transformations 



Let $T_1: P_1 \to P_2$ and $T_2: P_2 \to P_2$ be the linear transformations given by the formulas

$$T_1(p(x)) = xp(x) \quad \text{and} \quad T_2(p(x)) = p(2x + 4)$$

Then the composition $(T_2 \circ T_1): P_1 \to P_2$ is given by the formula

$$(T_2 \circ T_1)(p(x)) = T_2(T_1(p(x))) = T_2(xp(x)) = (2x + 4)p(2x + 4)$$

In particular, if $p(x) = c_0 + c_1x$, then

$$\begin{aligned}
(T_2 \circ T_1)(p(x)) &= (T_2 \circ T_1)(c_0 + c_1x) = (2x + 4)(c_0 + c_1(2x + 4)) \\
&= c_0(2x + 4) + c_1(2x + 4)^2
\end{aligned}$$
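The composition in Example 15 can be checked on coefficient lists. This is a minimal sketch, assuming a polynomial $c_0 + c_1x + \cdots$ is stored as `[c0, c1, ...]`; the Python names `T1`, `T2`, `compose`, and `poly_mul` are hypothetical and mirror the notation of the example.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def T1(p):
    """T1(p(x)) = x p(x): shift every coefficient up by one degree."""
    return [0] + list(p)


def T2(p):
    """T2(p(x)) = p(2x + 4), computed with Horner's scheme."""
    result = [p[-1]]
    for c in reversed(p[:-1]):
        result = poly_mul(result, [4, 2])  # multiply by (2x + 4)
        result[0] += c
    return result


def compose(f, g):
    """Return the function u -> f(g(u)), that is, f composed with g."""
    return lambda u: f(g(u))


T2_after_T1 = compose(T2, T1)

# p(x) = 1 + x gives (2x + 4) + (2x + 4)^2 = 20 + 18x + 4x^2.
print(T2_after_T1([1, 1]))  # [20, 18, 4]
```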



EXAMPLE 16 Composition with the Identity Operator 



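If $T: V \to V$ is any linear operator, and if $I: V \to V$ is the identity operator (Example 3), then for all vectors $\mathbf{v}$ in $V$, we have

$$\begin{aligned}
(T \circ I)(\mathbf{v}) &= T(I(\mathbf{v})) = T(\mathbf{v}) \\
(I \circ T)(\mathbf{v}) &= I(T(\mathbf{v})) = T(\mathbf{v})
\end{aligned}$$

It follows that $T \circ I$ and $I \circ T$ are the same as $T$; that is,

$$T \circ I = T \quad \text{and} \quad I \circ T = T \tag{3}$$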



We conclude this section by noting that compositions can be defined for more than two linear transformations. For example, if

$$T_1: U \to V, \quad T_2: V \to W, \quad \text{and} \quad T_3: W \to Y$$

are linear transformations, then the composition $T_3 \circ T_2 \circ T_1$ is defined by

$$(T_3 \circ T_2 \circ T_1)(\mathbf{u}) = T_3(T_2(T_1(\mathbf{u}))) \tag{4}$$

(Figure 8.1.6).

Figure 8.1.6 The composition of three linear transformations.



Exercise Set 8.1 






1. Use the definition of a linear operator that was given in this section to show that the function $T: R^2 \to R^2$ given by the formula $T(x_1, x_2) = (x_1 + 2x_2, 3x_1 - x_2)$ is a linear operator.

2. Use the definition of a linear transformation given in this section to show that the function $T: R^3 \to R^2$ given by the formula $T(x_1, x_2, x_3) = (2x_1 - x_2 + x_3, x_2 - 4x_3)$ is a linear transformation.

In Exercises 3-10 determine whether the function is a linear transformation. Justify your answer.

3. $T: V \to R$, where $V$ is an inner product space, and $T(\mathbf{u}) = \|\mathbf{u}\|$.

4. $T: R^3 \to R^3$, where $\mathbf{v}_0$ is a fixed vector in $R^3$ and $T(\mathbf{u}) = \mathbf{u} \times \mathbf{v}_0$.

5. $T: M_{22} \to M_{23}$, where $B$ is a fixed $2 \times 3$ matrix and $T(A) = AB$.

6. $T: M_{nn} \to R$, where $T(A) = \operatorname{tr}(A)$.

7. $F: M_{mn} \to M_{nm}$, where $F(A) = A^T$.

8. $T: M_{22} \to R$, where

(a) $T\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = 3a - 4b + c - d$

(b) $T\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = a^2 + b^2$

9. $T: P_2 \to P_2$, where

(a) $T(a_0 + a_1x + a_2x^2) = a_0 + a_1(x + 1) + a_2(x + 1)^2$

(b) $T(a_0 + a_1x + a_2x^2) = (a_0 + 1) + (a_1 + 1)x + (a_2 + 1)x^2$

10. $T: F(-\infty, \infty) \to F(-\infty, \infty)$, where

(a) $T(f(x)) = 1 + f(x)$

(b) $T(f(x)) = f(x + 1)$

11. Show that the function $T$ in Example 9 is a linear operator.



12. Consider the basis $S = \{\mathbf{v}_1, \mathbf{v}_2\}$ for $R^2$, where $\mathbf{v}_1 = (1, 1)$ and $\mathbf{v}_2 = (1, 0)$, and let $T: R^2 \to R^2$ be the linear operator such that

$$T(\mathbf{v}_1) = (1, -2) \quad \text{and} \quad T(\mathbf{v}_2) = (-4, 1)$$

Find a formula for $T(x_1, x_2)$, and use that formula to find $T(5, -3)$.

13. Consider the basis $S = \{\mathbf{v}_1, \mathbf{v}_2\}$ for $R^2$, where $\mathbf{v}_1 = (-2, 1)$ and $\mathbf{v}_2 = (1, 3)$, and let $T: R^2 \to R^3$ be the linear transformation such that

$$T(\mathbf{v}_1) = (-1, 2, 0) \quad \text{and} \quad T(\mathbf{v}_2) = (0, -3, 5)$$

Find a formula for $T(x_1, x_2)$, and use that formula to find $T(2, -3)$.

14. Consider the basis $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ for $R^3$, where $\mathbf{v}_1 = (1, 1, 1)$, $\mathbf{v}_2 = (1, 1, 0)$, and $\mathbf{v}_3 = (1, 0, 0)$, and let $T: R^3 \to R^3$ be the linear operator such that

$$T(\mathbf{v}_1) = (2, -1, 4), \quad T(\mathbf{v}_2) = (3, 0, 1), \quad T(\mathbf{v}_3) = (-1, 5, 1)$$

Find a formula for $T(x_1, x_2, x_3)$, and use that formula to find $T(2, 4, -1)$.

15. Consider the basis $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ for $R^3$, where $\mathbf{v}_1 = (1, 2, 1)$, $\mathbf{v}_2 = (2, 9, 0)$, and $\mathbf{v}_3 = (3, 3, 4)$, and let $T: R^3 \to R^2$ be the linear transformation such that

$$T(\mathbf{v}_1) = (1, 0), \quad T(\mathbf{v}_2) = (-1, 1), \quad T(\mathbf{v}_3) = (0, 1)$$

Find a formula for $T(x_1, x_2, x_3)$, and use that formula to find $T(7, 13, 7)$.



16. Let $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ be vectors in a vector space $V$, and let $T: V \to R^3$ be a linear transformation for which

$$T(\mathbf{v}_1) = (1, -1, 2), \quad T(\mathbf{v}_2) = (0, 3, 2), \quad T(\mathbf{v}_3) = (-3, 1, 2)$$

Find $T(2\mathbf{v}_1 - 3\mathbf{v}_2 + 4\mathbf{v}_3)$.

17. Find the domain and codomain of $T_2 \circ T_1$, and find $(T_2 \circ T_1)(x, y)$.

(a) $T_1(x, y) = (2x, 3y)$, $T_2(x, y) = (x - y, x + y)$

(b) $T_1(x, y) = (x - 3y, 0)$, $T_2(x, y) = (4x - 5y, 3x - 6y)$

(c) $T_1(x, y) = (2x, -3y, x + y)$, $T_2(x, y, z) = (x - y, y + z)$

(d) $T_1(x, y) = (x - y, y + z, x - z)$, $T_2(x, y, z) = (0, x + y + z)$

18. Find the domain and codomain of $T_3 \circ T_2 \circ T_1$, and find $(T_3 \circ T_2 \circ T_1)(x, y)$.

(a) $T_1(x, y) = (-2y, 3x, x - 2y)$, $T_2(x, y, z) = (y, z, x)$, $T_3(x, y, z) = (x + z, y - z)$

(b) $T_1(x, y) = (x + y, y, -x)$, $T_2(x, y, z) = (0, x + y + z, 3y)$, $T_3(x, y, z) = (3x + 2y, 4z - x - 3y)$



19. Let $T_1: M_{22} \to R$ and $T_2: M_{22} \to M_{22}$ be the linear transformations given by $T_1(A) = \operatorname{tr}(A)$ and $T_2(A) = A^T$.

(a) Find $(T_1 \circ T_2)(A)$, where $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$.

(b) Can you find $(T_2 \circ T_1)(A)$? Explain.

20. Let $T_1: P_n \to P_n$ and $T_2: P_n \to P_n$ be the linear operators given by $T_1(p(x)) = p(x - 1)$ and $T_2(p(x)) = p(x + 1)$. Find $(T_1 \circ T_2)(p(x))$ and $(T_2 \circ T_1)(p(x))$.

21. Let $T_1: V \to V$ be the dilation $T_1(\mathbf{v}) = 4\mathbf{v}$. Find a linear operator $T_2: V \to V$ such that $T_1 \circ T_2 = I$ and $T_2 \circ T_1 = I$.

22. Suppose that the linear transformations $T_1: P_2 \to P_2$ and $T_2: P_2 \to P_3$ are given by the formulas $T_1(p(x)) = p(x + 1)$ and $T_2(p(x)) = xp(x)$. Find $(T_2 \circ T_1)(a_0 + a_1x + a_2x^2)$.



23. Let $q_0(x)$ be a fixed polynomial of degree $m$, and define a function $T$ with domain $P_n$ by the formula



(a) Show that $T$ is a linear transformation.

(b) What is the codomain of $T$?



24. Use the definition of $T_3 \circ T_2 \circ T_1$ given by Formula 4 to prove that

(a) $T_3 \circ T_2 \circ T_1$ is a linear transformation

(b) $T_3 \circ T_2 \circ T_1 = (T_3 \circ T_2) \circ T_1$

(c) $T_3 \circ T_2 \circ T_1 = T_3 \circ (T_2 \circ T_1)$

25. Let $T: R^3 \to R^3$ be the orthogonal projection of $R^3$ onto the $xy$-plane. Show that $T \circ T = T$.

26.

(a) Let $T: V \to W$ be a linear transformation, and let $k$ be a scalar. Define the function $(kT): V \to W$ by $(kT)(\mathbf{v}) = k(T(\mathbf{v}))$. Show that $kT$ is a linear transformation.

(b) Find $(3T)(x_1, x_2)$ if $T: R^2 \to R^2$ is given by the formula $T(x_1, x_2) = (2x_1 - x_2, x_2 + x_1)$.

27.

(a) Let $T_1: V \to W$ and $T_2: V \to W$ be linear transformations. Define the functions $(T_1 + T_2): V \to W$ and $(T_1 - T_2): V \to W$ by

$$\begin{aligned}
(T_1 + T_2)(\mathbf{v}) &= T_1(\mathbf{v}) + T_2(\mathbf{v}) \\
(T_1 - T_2)(\mathbf{v}) &= T_1(\mathbf{v}) - T_2(\mathbf{v})
\end{aligned}$$

Show that $T_1 + T_2$ and $T_1 - T_2$ are linear transformations.

(b) Find $(T_1 + T_2)(x, y)$ and $(T_1 - T_2)(x, y)$ if $T_1: R^2 \to R^2$ and $T_2: R^2 \to R^2$ are given by the formulas $T_1(x, y) = (2y, 3x)$ and $T_2(x, y) = (y, x)$.

28.

(a) Prove that if $a_1$, $a_2$, $b_1$, and $b_2$ are any scalars, then the formula

$$F(x, y) = (a_1x + b_1y, a_2x + b_2y)$$

defines a linear operator on $R^2$.

(b) Does the formula $F(x, y) = (a_1x^2 + b_1y^2, a_2x^2 + b_2y^2)$ define a linear operator on $R^2$? Explain.

29. Let $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be a basis for a vector space $V$, and let $T: V \to W$ be a linear transformation. Show that if $T(\mathbf{v}_1) = T(\mathbf{v}_2) = \cdots = T(\mathbf{v}_n) = \mathbf{0}$, then $T$ is the zero transformation.

30. Let $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be a basis for a vector space $V$, and let $T: V \to V$ be a linear operator. Show that if $T(\mathbf{v}_1) = \mathbf{v}_1$, $T(\mathbf{v}_2) = \mathbf{v}_2$, ..., $T(\mathbf{v}_n) = \mathbf{v}_n$, then $T$ is the identity transformation on $V$.

31. (For Readers Who Have Studied Calculus) Let

$$D(\mathbf{f}) = f'(x) \quad \text{and} \quad J(\mathbf{f}) = \int_0^x f(t)\,dt$$

be the linear transformations in Examples 11 and 12. Find $(J \circ D)(\mathbf{f})$ for

(a) $\mathbf{f}(x) = x^2 + 3x + 2$

(b) $\mathbf{f}(x) = \sin x$

(c) $\mathbf{f}(x) = e^x + 3$

32. (For Readers Who Have Studied Calculus) Let $V = C[a, b]$ be the vector space of functions continuous on $[a, b]$, and let $T: V \to V$ be the transformation defined by

$$T(\mathbf{f}) = 5f(x) + 3\int_a^x f(t)\,dt$$

Is $T$ a linear operator?

Discussion and Discovery

33. Indicate whether each statement is always true or sometimes false. Justify your answer by giving a logical argument or a counterexample. In each part, $V$ and $W$ are vector spaces.

(a) If $T(c_1\mathbf{v}_1 + c_2\mathbf{v}_2) = c_1T(\mathbf{v}_1) + c_2T(\mathbf{v}_2)$ for all vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ in $V$ and all scalars $c_1$ and $c_2$, then $T$ is a linear transformation.

(b) If $\mathbf{v}$ is a nonzero vector in $V$, then there is exactly one linear transformation $T: V \to W$ such that $T(-\mathbf{v}) = -T(\mathbf{v})$.

(c) There is exactly one linear transformation $T: V \to W$ for which $T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u} - \mathbf{v})$ for all vectors $\mathbf{u}$ and $\mathbf{v}$ in $V$.

(d) If $\mathbf{v}_0$ is a nonzero vector in $V$, then the formula $T(\mathbf{v}) = \mathbf{v}_0 + \mathbf{v}$ defines a linear operator on $V$.

34. If $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis for a vector space $V$, how many different linear operators can be created that map each vector in $B$ back into $B$? Explain your reasoning.

35. Refer to Section 4.4. Are the transformations from $P_n$ to $P_m$ that correspond to linear transformations from $R^{n+1}$ to $R^{m+1}$ necessarily linear transformations from $P_n$ to $P_m$?

8.2 KERNEL AND RANGE

In this section we shall develop some basic properties of linear transformations that generalize properties of matrix transformations obtained earlier in the text.



Kernel and Range 

Recall that if $A$ is an $m \times n$ matrix, then the nullspace of $A$ consists of all vectors $\mathbf{x}$ in $R^n$ such that $A\mathbf{x} = \mathbf{0}$, and by Theorem 5.5.1 the column space of $A$ consists of all vectors $\mathbf{b}$ in $R^m$ for which there is at least one vector $\mathbf{x}$ in $R^n$ such that $A\mathbf{x} = \mathbf{b}$. From the viewpoint of matrix transformations, the nullspace of $A$ consists of all vectors in $R^n$ that multiplication by $A$ maps into $\mathbf{0}$, and the column space of $A$ consists of all vectors in $R^m$ that are images of at least one vector in $R^n$ under multiplication by $A$. The following definition extends these ideas to general linear transformations.



DEFINITION 



If $T: V \to W$ is a linear transformation, then the set of vectors in $V$ that $T$ maps into $\mathbf{0}$ is called the kernel of $T$; it is denoted by $\ker(T)$. The set of all vectors in $W$ that are images under $T$ of at least one vector in $V$ is called the range of $T$; it is denoted by $R(T)$.



EXAMPLE 1 Kernel and Range of a Matrix Transformation 



If $T_A: R^n \to R^m$ is multiplication by the $m \times n$ matrix $A$, then from the discussion preceding the definition above, the kernel of $T_A$ is the nullspace of $A$, and the range of $T_A$ is the column space of $A$.



EXAMPLE 2 Kernel and Range of the Zero Transformation 



Let $T: V \to W$ be the zero transformation (Example 2 of Section 8.1). Since $T$ maps every vector in $V$ into $\mathbf{0}$, it follows that $\ker(T) = V$. Moreover, since $\mathbf{0}$ is the only image under $T$ of vectors in $V$, we have $R(T) = \{\mathbf{0}\}$.



EXAMPLE 3 Kernel and Range of the Identity Operator 



Let $I: V \to V$ be the identity operator (Example 3 of Section 8.1). Since $I(\mathbf{v}) = \mathbf{v}$ for all vectors in $V$, every vector in $V$ is the image of some vector (namely, itself); thus $R(I) = V$. Since the only vector that $I$ maps into $\mathbf{0}$ is $\mathbf{0}$, it follows that $\ker(I) = \{\mathbf{0}\}$.



EXAMPLE 4 Kernel and Range of an Orthogonal Projection 



Let $T: R^3 \to R^3$ be the orthogonal projection on the $xy$-plane. The kernel of $T$ is the set of points that $T$ maps into $\mathbf{0} = (0, 0, 0)$; these are the points on the $z$-axis (Figure 8.2.1a). Since $T$ maps every point in $R^3$ into the $xy$-plane, the range of $T$ must be some subset of this plane. But every point $(x_0, y_0, 0)$ in the $xy$-plane is the image under $T$ of some point; in fact, it is the image of all points on the vertical line that passes through $(x_0, y_0, 0)$ (Figure 8.2.1b). Thus $R(T)$ is the entire $xy$-plane.

Figure 8.2.1 (a) $\ker(T)$ is the $z$-axis. (b) $R(T)$ is the entire $xy$-plane.



EXAMPLE 5 Kernel and Range of a Rotation 



Let $T: R^2 \to R^2$ be the linear operator that rotates each vector in the $xy$-plane through the angle $\theta$ (Figure 8.2.2). Since every vector in the $xy$-plane can be obtained by rotating some vector through the angle $\theta$ (why?), we have $R(T) = R^2$. Moreover, the only vector that rotates into $\mathbf{0}$ is $\mathbf{0}$, so $\ker(T) = \{\mathbf{0}\}$.




Figure 8.2.2 



EXAMPLE 6 Kernel of a Differentiation Transformation

Calculus Required



Let $V = C^1(-\infty, \infty)$ be the vector space of functions with continuous first derivatives on $(-\infty, \infty)$, let $W = F(-\infty, \infty)$ be the vector space of all real-valued functions defined on $(-\infty, \infty)$, and let $D: V \to W$ be the differentiation transformation $D(\mathbf{f}) = f'(x)$. The kernel of $D$ is the set of functions in $V$ with derivative zero. From calculus, this is the set of constant functions on $(-\infty, \infty)$.

Properties of Kernel and Range 

In all of the preceding examples, $\ker(T)$ and $R(T)$ turned out to be subspaces. In Examples 2, 3, and 5 they were either the zero subspace or the entire vector space. In Example 4 the kernel was a line through the origin, and the range was a plane through the origin, both of which are subspaces of $R^3$. All of this is not accidental; it is a consequence of the following general result.

THEOREM 8.2.1 



If $T: V \to W$ is a linear transformation, then

(a) The kernel of $T$ is a subspace of $V$.

(b) The range of $T$ is a subspace of $W$.



Proof (a) To show that $\ker(T)$ is a subspace, we must show that it contains at least one vector and is closed under addition and scalar multiplication. By part (a) of Theorem 8.1.1, the vector $\mathbf{0}$ is in $\ker(T)$, so this set contains at least one vector. Let $\mathbf{v}_1$ and $\mathbf{v}_2$ be vectors in $\ker(T)$, and let $k$ be any scalar. Then

$$T(\mathbf{v}_1 + \mathbf{v}_2) = T(\mathbf{v}_1) + T(\mathbf{v}_2) = \mathbf{0} + \mathbf{0} = \mathbf{0}$$

so $\mathbf{v}_1 + \mathbf{v}_2$ is in $\ker(T)$. Also,

$$T(k\mathbf{v}_1) = kT(\mathbf{v}_1) = k\mathbf{0} = \mathbf{0}$$

so $k\mathbf{v}_1$ is in $\ker(T)$.



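Proof (b) Since $T(\mathbf{0}) = \mathbf{0}$, there is at least one vector in $R(T)$. Let $\mathbf{w}_1$ and $\mathbf{w}_2$ be vectors in the range of $T$, and let $k$ be any scalar. To prove this part, we must show that $\mathbf{w}_1 + \mathbf{w}_2$ and $k\mathbf{w}_1$ are in the range of $T$; that is, we must find vectors $\mathbf{a}$ and $\mathbf{b}$ in $V$ such that $T(\mathbf{a}) = \mathbf{w}_1 + \mathbf{w}_2$ and $T(\mathbf{b}) = k\mathbf{w}_1$.

Since $\mathbf{w}_1$ and $\mathbf{w}_2$ are in the range of $T$, there are vectors $\mathbf{a}_1$ and $\mathbf{a}_2$ in $V$ such that $T(\mathbf{a}_1) = \mathbf{w}_1$ and $T(\mathbf{a}_2) = \mathbf{w}_2$. Let $\mathbf{a} = \mathbf{a}_1 + \mathbf{a}_2$ and $\mathbf{b} = k\mathbf{a}_1$. Then

$$T(\mathbf{a}) = T(\mathbf{a}_1 + \mathbf{a}_2) = T(\mathbf{a}_1) + T(\mathbf{a}_2) = \mathbf{w}_1 + \mathbf{w}_2$$

and

$$T(\mathbf{b}) = T(k\mathbf{a}_1) = kT(\mathbf{a}_1) = k\mathbf{w}_1$$

which completes the proof.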

In Section 5.6 we defined the rank of a matrix to be the dimension of its column (or row) space and the nullity to be the 
dimension of its nullspace. We now extend these definitions to general linear transformations. 



DEFINITION 



If $T: V \to W$ is a linear transformation, then the dimension of the range of $T$ is called the rank of $T$ and is denoted by $\operatorname{rank}(T)$; the dimension of the kernel is called the nullity of $T$ and is denoted by $\operatorname{nullity}(T)$.



If $A$ is an $m \times n$ matrix and $T_A: R^n \to R^m$ is multiplication by $A$, then we know from Example 1 that the kernel of $T_A$ is the nullspace of $A$ and the range of $T_A$ is the column space of $A$. Thus we have the following relationship between the rank and nullity of a matrix and the rank and nullity of the corresponding matrix transformation.



THEOREM 8.2.2 



If $A$ is an $m \times n$ matrix and $T_A: R^n \to R^m$ is multiplication by $A$, then

(a) $\operatorname{nullity}(T_A) = \operatorname{nullity}(A)$

(b) $\operatorname{rank}(T_A) = \operatorname{rank}(A)$



EXAMPLE 7 Finding Rank and Nullity 



Let $T_A: R^6 \to R^4$ be multiplication by

$$A = \begin{bmatrix}
-1 & 2 & 0 & 4 & 5 & -3 \\
3 & -7 & 2 & 0 & 1 & 4 \\
2 & -5 & 2 & 4 & 6 & 1 \\
4 & -9 & 2 & -4 & -4 & 7
\end{bmatrix}$$

Find the rank and nullity of $T_A$.



Solution 



In Example 1 of Section 5.6, we showed that $\operatorname{rank}(A) = 2$ and $\operatorname{nullity}(A) = 4$. Thus, from Theorem 8.2.2, we have $\operatorname{rank}(T_A) = 2$ and $\operatorname{nullity}(T_A) = 4$.



EXAMPLE 8 Finding Rank and Nullity 



Let $T: R^3 \to R^3$ be the orthogonal projection on the $xy$-plane. From Example 4, the kernel of $T$ is the $z$-axis, which is one-dimensional, and the range of $T$ is the $xy$-plane, which is two-dimensional. Thus

$$\operatorname{nullity}(T) = 1 \quad \text{and} \quad \operatorname{rank}(T) = 2$$



Dimension Theorem for Linear Transformations 

Recall from the Dimension Theorem for Matrices (Theorem 5.6.3) that if $A$ is a matrix with $n$ columns, then

$$\operatorname{rank}(A) + \operatorname{nullity}(A) = n$$

The following theorem, whose proof is deferred to the end of the section, extends this result to general linear transformations.

THEOREM 8.2.3 



Dimension Theorem for Linear Transformations 

If $T: V \to W$ is a linear transformation from an $n$-dimensional vector space $V$ to a vector space $W$, then

$$\operatorname{rank}(T) + \operatorname{nullity}(T) = n \tag{1}$$



In words, this theorem states that for linear transformations the rank plus the nullity is equal to the dimension of the domain. 
This theorem is also known as the Rank Theorem. 

Remark If $A$ is an $m \times n$ matrix and $T_A: R^n \to R^m$ is multiplication by $A$, then the domain of $T_A$ has dimension $n$, so Theorem 8.2.3 agrees with Theorem 5.6.3 in this case.



EXAMPLE 9 Using the Dimension Theorem 



Let $T: R^2 \to R^2$ be the linear operator that rotates each vector in the $xy$-plane through an angle $\theta$. We showed in Example 5 that $\ker(T) = \{\mathbf{0}\}$ and $R(T) = R^2$. Thus

$$\operatorname{rank}(T) + \operatorname{nullity}(T) = 2 + 0 = 2$$

which is consistent with the fact that the domain of $T$ is two-dimensional.
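For a matrix transformation, the rank and a basis for the kernel can both be read off the reduced row echelon form, so the dimension theorem can be checked on small examples. The sketch below is one possible way to do this with exact fractions; `rref` and `kernel_basis` are hypothetical helper names, and the two test matrices (the projection on the $xy$-plane and the rotation through $90^\circ$) are chosen here to match Examples 8 and 9.

```python
from fractions import Fraction


def rref(a):
    """Return (reduced row echelon form of a, list of pivot column indices)."""
    m = [[Fraction(e) for e in row] for row in a]
    rows, cols = len(m), len(m[0])
    pivots, r = [], 0
    for c in range(cols):
        if r == rows:
            break
        pr = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pr is None:
            continue  # no pivot in this column: it is a free column
        m[r], m[pr] = m[pr], m[r]
        lead = m[r][c]
        m[r] = [e / lead for e in m[r]]
        for i in range(rows):
            if i != r:
                f = m[i][c]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots


def kernel_basis(a):
    """Return a basis for the nullspace of a, one vector per free column."""
    m, pivots = rref(a)
    cols = len(a[0])
    basis = []
    for j in range(cols):
        if j in pivots:
            continue
        v = [Fraction(0)] * cols
        v[j] = Fraction(1)
        for i, p in enumerate(pivots):
            v[p] = -m[i][j]
        basis.append(v)
    return basis


projection = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]  # orthogonal projection on the xy-plane
rotation = [[0, -1], [1, 0]]                    # rotation through 90 degrees

for name, a in (("projection", projection), ("rotation", rotation)):
    rank = len(rref(a)[1])
    nullity = len(kernel_basis(a))
    print(name, rank, nullity, rank + nullity == len(a[0]))
```

Counting free columns makes the identity rank + nullity = $n$ automatic for matrices; the more informative check is that each vector returned by `kernel_basis` really is mapped to $\mathbf{0}$ by the matrix.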



Additional Proof 



Proof of Theorem 8.2.3 We must show that

$$\dim(R(T)) + \dim(\ker(T)) = n$$

We shall give the proof for the case where $1 \le \dim(\ker(T)) < n$. The cases where $\dim(\ker(T)) = 0$ and $\dim(\ker(T)) = n$ are left as exercises. Assume $\dim(\ker(T)) = r$, and let $\mathbf{v}_1, \ldots, \mathbf{v}_r$ be a basis for the kernel. Since $\{\mathbf{v}_1, \ldots, \mathbf{v}_r\}$ is linearly independent, Theorem 5.4.6b states that there are $n - r$ vectors, $\mathbf{v}_{r+1}, \ldots, \mathbf{v}_n$, such that the extended set $\{\mathbf{v}_1, \ldots, \mathbf{v}_r, \mathbf{v}_{r+1}, \ldots, \mathbf{v}_n\}$ is a basis for $V$. To complete the proof, we shall show that the $n - r$ vectors in the set $S = \{T(\mathbf{v}_{r+1}), \ldots, T(\mathbf{v}_n)\}$ form a basis for the range of $T$. It will then follow that

$$\dim(R(T)) + \dim(\ker(T)) = (n - r) + r = n$$

First we show that $S$ spans the range of $T$. If $\mathbf{b}$ is any vector in the range of $T$, then $\mathbf{b} = T(\mathbf{v})$ for some vector $\mathbf{v}$ in $V$. Since $\{\mathbf{v}_1, \ldots, \mathbf{v}_r, \mathbf{v}_{r+1}, \ldots, \mathbf{v}_n\}$ is a basis for $V$, the vector $\mathbf{v}$ can be written in the form

$$\mathbf{v} = c_1\mathbf{v}_1 + \cdots + c_r\mathbf{v}_r + c_{r+1}\mathbf{v}_{r+1} + \cdots + c_n\mathbf{v}_n$$

Since $\mathbf{v}_1, \ldots, \mathbf{v}_r$ lie in the kernel of $T$, we have $T(\mathbf{v}_1) = \cdots = T(\mathbf{v}_r) = \mathbf{0}$, so

$$\mathbf{b} = T(\mathbf{v}) = c_{r+1}T(\mathbf{v}_{r+1}) + \cdots + c_nT(\mathbf{v}_n)$$

Thus $S$ spans the range of $T$.

Finally, we show that $S$ is a linearly independent set and consequently forms a basis for the range of $T$. Suppose that some linear combination of the vectors in $S$ is zero; that is,

$$k_{r+1}T(\mathbf{v}_{r+1}) + \cdots + k_nT(\mathbf{v}_n) = \mathbf{0} \tag{2}$$

We must show that $k_{r+1} = \cdots = k_n = 0$. Since $T$ is linear, Formula 2 can be rewritten as

$$T(k_{r+1}\mathbf{v}_{r+1} + \cdots + k_n\mathbf{v}_n) = \mathbf{0}$$

which says that $k_{r+1}\mathbf{v}_{r+1} + \cdots + k_n\mathbf{v}_n$ is in the kernel of $T$. This vector can therefore be written as a linear combination of the basis vectors $\{\mathbf{v}_1, \ldots, \mathbf{v}_r\}$, say

$$k_{r+1}\mathbf{v}_{r+1} + \cdots + k_n\mathbf{v}_n = k_1\mathbf{v}_1 + \cdots + k_r\mathbf{v}_r$$

Thus,

$$k_1\mathbf{v}_1 + \cdots + k_r\mathbf{v}_r - k_{r+1}\mathbf{v}_{r+1} - \cdots - k_n\mathbf{v}_n = \mathbf{0}$$

Since $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is linearly independent, all of the $k$'s are zero; in particular, $k_{r+1} = \cdots = k_n = 0$, which completes the proof.



Exercise Set 8.2 




1. Let $T: R^2 \to R^2$ be the linear operator given by the formula

$$T(x, y) = (2x - y, -8x + 4y)$$

Which of the following vectors are in $R(T)$?

(a) $(1, -4)$

(b) $(5, 0)$

(c) $(-3, 12)$

2. Let $T: R^2 \to R^2$ be the linear operator in Exercise 1. Which of the following vectors are in $\ker(T)$?

(a) $(5, 10)$

(b) $(3, 2)$

(c) $(1, 1)$

3. Let $T: R^4 \to R^3$ be the linear transformation given by the formula

$$T(x_1, x_2, x_3, x_4) = (4x_1 + x_2 - 2x_3 - 3x_4, \; 2x_1 + x_2 + x_3 - 4x_4, \; 6x_1 - 9x_3 + 9x_4)$$

Which of the following are in $R(T)$?

(a) $(0, 0, 6)$

(b) $(1, 3, 0)$

(c) $(2, 4, 1)$

4. Let $T: R^4 \to R^3$ be the linear transformation in Exercise 3. Which of the following are in $\ker(T)$?

(a) $(3, -8, 2, 0)$

(b) $(0, 0, 0, 1)$

(c) $(0, -4, 1, 0)$

5. Let $T: P_2 \to P_3$ be the linear transformation defined by $T(p(x)) = xp(x)$. Which of the following are in $\ker(T)$?

(a) $x^2$

(b) $0$

(c) $1 + x$

6. Let $T: P_2 \to P_3$ be the linear transformation in Exercise 5. Which of the following are in $R(T)$?

(a) $x + x^2$

(b) $1 + x$

(c) $3 - x^2$



7. Find a basis for the kernel of

(a) the linear operator in Exercise 1

(b) the linear transformation in Exercise 3

(c) the linear transformation in Exercise 5.

8. Find a basis for the range of

(a) the linear operator in Exercise 1

(b) the linear transformation in Exercise 3

(c) the linear transformation in Exercise 5.

9. Verify Formula 1 of the dimension theorem for

(a) the linear operator in Exercise 1

(b) the linear transformation in Exercise 3

(c) the linear transformation in Exercise 5.

In Exercises 10-13 let $T$ be multiplication by the matrix $A$. Find

(a) a basis for the range of T 

(b) a basis for the kernel of T 

(c) the rank and nullity of T 

(d) the rank and nullity of A 



10. $A = \begin{bmatrix} 1 & -1 & 3 \\ 5 & 6 & -4 \\ 7 & 4 & 2 \end{bmatrix}$



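11. $A = \begin{bmatrix} 2 & 0 & -1 \\ 4 & 0 & -2 \\ 0 & 0 & 0 \end{bmatrix}$

12. $A = \begin{bmatrix} 4 & 1 & 5 & 2 \\ 1 & 2 & 3 & 0 \end{bmatrix}$

13. $A = \begin{bmatrix} 1 & 4 & 5 & 0 & 9 \\ 3 & -2 & 1 & 0 & -1 \\ -1 & 0 & -1 & 0 & -1 \\ 2 & 3 & 5 & 1 & 8 \end{bmatrix}$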



14. Describe the kernel and range of

(a) the orthogonal projection on the $xz$-plane

(b) the orthogonal projection on the $yz$-plane

(c) the orthogonal projection on the plane defined by the equation $y = x$

15. Let $V$ be any vector space, and let $T: V \to V$ be defined by $T(\mathbf{v}) = 3\mathbf{v}$.

(a) What is the kernel of $T$?

(b) What is the range of $T$?

16. In each part, use the given information to find the nullity of $T$.

(a) $T: R^5 \to R^7$ has rank 3.

(b) $T: P_4 \to P_3$ has rank 1.

(c) The range of $T: R^6 \to R^3$ is $R^3$.

(d) $T: M_{22} \to M_{22}$ has rank 3.

17. Let $A$ be a $7 \times 6$ matrix such that $A\mathbf{x} = \mathbf{0}$ has only the trivial solution, and let $T: R^6 \to R^7$ be multiplication by $A$. Find the rank and nullity of $T$.

18. Let $A$ be a $5 \times 7$ matrix with rank 4.

(a) What is the dimension of the solution space of $A\mathbf{x} = \mathbf{0}$?

(b) Is $A\mathbf{x} = \mathbf{b}$ consistent for all vectors $\mathbf{b}$ in $R^5$? Explain.

19. Let $T: R^3 \to V$ be a linear transformation from $R^3$ to any vector space. Show that the kernel of $T$ is a line through the origin, a plane through the origin, the origin only, or all of $R^3$.

20. Let $T: V \to R^3$ be a linear transformation from any vector space to $R^3$. Show that the range of $T$ is a line through the origin, a plane through the origin, the origin only, or all of $R^3$.

Let TR 3 * i? 3 be multiplication by 

21 ' M 3 4 

3 4 7 
-2 2 



(a) Show that the kernel of r is a line through the origin, and find parametric equations for it. 

(b) Show that the range of T is a plane through the origin, and find an equation for it. 



22.

Prove: If $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis for V and $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n$ are vectors in W, not necessarily distinct, then there exists a linear transformation $T: V \to W$ such that

$$T(\mathbf{v}_1) = \mathbf{w}_1, \quad T(\mathbf{v}_2) = \mathbf{w}_2, \quad \ldots, \quad T(\mathbf{v}_n) = \mathbf{w}_n$$

23.

For the positive integer $n > 1$, let $T: M_{nn} \to R$ be the linear transformation defined by $T(A) = \operatorname{tr}(A)$, for A an $n \times n$ matrix with real entries. Determine the dimension of ker(T).

24.

Prove the dimension theorem in the cases

(a) $\dim(\ker(T)) = 0$

(b) $\dim(\ker(T)) = n$



25. (For Readers Who Have Studied Calculus) Let $D: P_3 \to P_2$ be the differentiation transformation $D(\mathbf{p}) = p'(x)$. Describe the kernel of D.

26. (For Readers Who Have Studied Calculus) Let $J: P_1 \to R$ be the integration transformation $J(\mathbf{p}) = \int p(x)\,dx$. Describe the kernel of J.

27. (For Readers Who Have Studied Calculus) Let $D: V \to W$ be the differentiation transformation $D(\mathbf{f}) = f'(x)$, where $V = C^3(-\infty, \infty)$ and $W = F(-\infty, \infty)$. Describe the kernels of $D \circ D$ and $D \circ D \circ D$.



Discussion
Discovery



28.

Fill in the blanks.

(a) If $T_A: R^n \to R^m$ is multiplication by A, then the nullspace of A corresponds to the ________ of $T_A$, and the column space of A corresponds to the ________ of $T_A$.

(b) If $T: R^3 \to R^3$ is the orthogonal projection on the plane $x + y + z = 0$, then the kernel of T is the line through the origin that is parallel to the vector ________.

(c) If V is a finite-dimensional vector space and $T: V \to W$ is a linear transformation, then the dimension of the range of T plus the dimension of the kernel of T is ________.

(d) If $T_A: R^5 \to R^3$ is multiplication by A, and if $\operatorname{rank}(T_A) = 2$, then the general solution of $A\mathbf{x} = \mathbf{0}$ has ________ (how many?) parameters.



29. 



(a) If $T: R^3 \to R^3$ is a linear operator, and if the kernel of T is a line through the origin, then what kind of geometric object is the range of T? Explain your reasoning.

(b) If $T: R^3 \to R^3$ is a linear operator, and if the range of T is a plane through the origin, then what kind of geometric object is the kernel of T? Explain your reasoning.



30. (For Readers Who Have Studied Calculus) Let V be the vector space of real-valued functions with continuous derivatives of all orders on the interval $(-\infty, \infty)$, and let $W = F(-\infty, \infty)$ be the vector space of real-valued functions defined on $(-\infty, \infty)$.

(a) Find a linear transformation $T: V \to W$ whose kernel is $P_3$.

(b) Find a linear transformation $T: V \to W$ whose kernel is $P_n$.

31.

If A is an $m \times n$ matrix, and if the linear system $A\mathbf{x} = \mathbf{b}$ is consistent for every vector $\mathbf{b}$ in $R^m$, what can you say about the range of $T_A: R^n \to R^m$?
8.3

INVERSE LINEAR 
TRANSFORMATIONS 



In Section 4.3 we discussed properties of one-to-one linear transformations
from $R^n$ to $R^m$. In this section we shall extend those ideas to more general
kinds of linear transformations.



Recall from Section 4.3 that a linear transformation from $R^n$ to $R^m$ is called one-to-one if it maps distinct vectors in $R^n$ into
distinct vectors in $R^m$. The following definition generalizes that idea.



DEFINITION 



A linear transformation $T: V \to W$ is said to be one-to-one if T maps distinct vectors in V into distinct vectors in W.



EXAMPLE 1 A One-to-One Linear Transformation 



Recall from Theorem 4.3.1 that if A is an $n \times n$ matrix and $T_A: R^n \to R^n$ is multiplication by A, then $T_A$ is one-to-one if and only if A is an invertible matrix.



EXAMPLE 2 A One-to-One Linear Transformation 



Let $T: P_n \to P_{n+1}$ be the linear transformation

$$T(\mathbf{p}) = T(p(x)) = x\,p(x)$$

discussed in Example 8 of Section 8.1. If

$$\mathbf{p} = p(x) = c_0 + c_1 x + \cdots + c_n x^n \quad \text{and} \quad \mathbf{q} = q(x) = d_0 + d_1 x + \cdots + d_n x^n$$

are distinct polynomials, then they differ in at least one coefficient. Thus,

$$T(\mathbf{p}) = c_0 x + c_1 x^2 + \cdots + c_n x^{n+1} \quad \text{and} \quad T(\mathbf{q}) = d_0 x + d_1 x^2 + \cdots + d_n x^{n+1}$$

also differ in at least one coefficient. Thus T is one-to-one, since it maps distinct polynomials $\mathbf{p}$ and $\mathbf{q}$ into distinct
polynomials $T(\mathbf{p})$ and $T(\mathbf{q})$.



EXAMPLE 3 A Transformation That Is Not One-to-One 



Calculus Required 



Let

$$D: C^1(-\infty, \infty) \to F(-\infty, \infty)$$

be the differentiation transformation discussed in Example 11 of Section 8.1. This linear transformation is not one-to-one
because it maps functions that differ by a constant into the same function. For example,

$$D(x^2) = D(x^2 + 1) = 2x$$

The following theorem establishes a relationship between a one-to-one linear transformation and its kernel.

THEOREM 8.3.1

Equivalent Statements

If $T: V \to W$ is a linear transformation, then the following are equivalent.

(a) T is one-to-one.

(b) The kernel of T contains only the zero vector; that is, $\ker(T) = \{\mathbf{0}\}$.

(c) nullity(T) = 0.



Proof The equivalence of (b) and (c) is immediate from the definition of nullity. We shall complete the proof by proving 
the equivalence of (a) and (b). 

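(a) $\Rightarrow$ (b) Assume that T is one-to-one, and let $\mathbf{v}$ be any vector in ker(T). Since $\mathbf{v}$ and $\mathbf{0}$ both lie in ker(T), we have $T(\mathbf{v}) = \mathbf{0}$
and $T(\mathbf{0}) = \mathbf{0}$, so $T(\mathbf{v}) = T(\mathbf{0})$. But this implies that $\mathbf{v} = \mathbf{0}$, since T is one-to-one; thus ker(T) contains only the zero vector.

(b) $\Rightarrow$ (a) Assume that $\ker(T) = \{\mathbf{0}\}$ and that $\mathbf{v}$ and $\mathbf{w}$ are distinct vectors in V; that is,

$$\mathbf{v} - \mathbf{w} \neq \mathbf{0} \tag{1}$$

To prove that T is one-to-one, we must show that $T(\mathbf{v})$ and $T(\mathbf{w})$ are distinct vectors. But if this were not so, then we would
have $T(\mathbf{v}) = T(\mathbf{w})$. Therefore,

$$T(\mathbf{v}) - T(\mathbf{w}) = \mathbf{0} \quad \text{or} \quad T(\mathbf{v} - \mathbf{w}) = \mathbf{0}$$

and so $\mathbf{v} - \mathbf{w}$ is in the kernel of T. Since $\ker(T) = \{\mathbf{0}\}$, this implies that $\mathbf{v} - \mathbf{w} = \mathbf{0}$, which contradicts (1). Thus $T(\mathbf{v})$ and
$T(\mathbf{w})$ must be distinct.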



EXAMPLE 4 Using Theorem 8.3.1 



In each part, determine whether the linear transformation is one-to-one by finding the kernel or the nullity and applying 
Theorem 8.3.1. 



(a) $T: R^2 \to R^2$ rotates each vector through the angle $\theta$.

(b) $T: R^3 \to R^3$ is the orthogonal projection on the xy-plane.

(c) $T: R^6 \to R^4$ is multiplication by the matrix

$$A = \begin{bmatrix} -1 & 2 & 0 & 4 & 5 & -3 \\ 3 & -7 & 2 & 0 & 1 & 4 \\ 2 & -5 & 2 & 4 & 6 & 1 \\ 4 & -9 & 2 & -4 & -4 & 7 \end{bmatrix}$$

Solution (a)

From Example 5 of Section 8.2, $\ker(T) = \{\mathbf{0}\}$, so T is one-to-one.

Solution (b)

From Example 4 of Section 8.2, ker(T) contains nonzero vectors, so T is not one-to-one.

Solution (c)

From Example 7 of Section 8.2, nullity(T) = 4, so T is not one-to-one.
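The nullity quoted in Solution (c) can be checked by machine. The following is a minimal sketch, not the text's method: it assumes the matrix entries of part (c) as read here, and the helper name `rank` is ours. It row-reduces with exact fractions and applies the dimension theorem, rank + nullity = number of columns.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    rows = [[Fraction(entry) for entry in row] for row in matrix]
    pivots = 0
    for col in range(len(rows[0])):
        # Find a row at or below the current pivot row with a nonzero entry.
        pivot = next((r for r in range(pivots, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        for r in range(pivots + 1, len(rows)):
            factor = rows[r][col] / rows[pivots][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivots])]
        pivots += 1
        if pivots == len(rows):
            break
    return pivots

# Matrix of part (c); the entries are an assumption based on our reading of the example.
A = [
    [-1,  2, 0,  4,  5, -3],
    [ 3, -7, 2,  0,  1,  4],
    [ 2, -5, 2,  4,  6,  1],
    [ 4, -9, 2, -4, -4,  7],
]

nullity = len(A[0]) - rank(A)      # dimension theorem
is_one_to_one = (nullity == 0)     # Theorem 8.3.1, statement (c)
print(nullity, is_one_to_one)
```

Since the nullity is not zero, the test in Theorem 8.3.1 fails and multiplication by A is not one-to-one.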

In the special case where T is a linear operator on a finite-dimensional vector space, a fourth equivalent statement can be
added to those in Theorem 8.3.1. 



THEOREM 8.3.2 



If V is a finite-dimensional vector space, and $T: V \to V$ is a linear operator, then the following are equivalent.

(a) T is one-to-one.

(b) $\ker(T) = \{\mathbf{0}\}$.

(c) nullity(T) = 0.

(d) The range of T is V; that is, R(T) = V.



Proof We already know that (a), (b), and (c) are equivalent, so we can complete the proof by proving the equivalence of (c) 
and (d). 

(c) $\Rightarrow$ (d) Suppose that $\dim(V) = n$ and nullity(T) = 0. It follows from the Dimension Theorem (Theorem 8.2.3) that

$$\operatorname{rank}(T) = n - \operatorname{nullity}(T) = n$$

By definition, rank(T) is the dimension of the range of T, so the range of T has dimension n. It now follows from Theorem
5.4.7 that the range of T is V, since the two spaces have the same dimension.

(d) $\Rightarrow$ (c) Suppose that $\dim(V) = n$ and R(T) = V. It follows from these relationships that $\dim(R(T)) = n$, or, equivalently,
rank(T) = n. Thus it follows from the Dimension Theorem (Theorem 8.2.3) that

$$\operatorname{nullity}(T) = n - \operatorname{rank}(T) = n - n = 0$$



EXAMPLE 5 A Transformation That Is Not One-To-One 



Let $T_A: R^4 \to R^4$ be multiplication by

$$A = \begin{bmatrix} 1 & 3 & -2 & 4 \\ 2 & 6 & -4 & 8 \\ 3 & 9 & 1 & 5 \\ 1 & 1 & 4 & 3 \end{bmatrix}$$

Determine whether $T_A$ is one-to-one.

Solution

As noted in Example 1, the given problem is equivalent to determining whether A is invertible. But det(A) = 0, since the
first two rows of A are proportional, and consequently, A is not invertible. Thus $T_A$ is not one-to-one.
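As a cross-check of the claim det(A) = 0, here is a small sketch that evaluates the determinant by exact Gaussian elimination. The function name `det` and the elimination strategy are our own choices, not part of the text.

```python
from fractions import Fraction

def det(matrix):
    """Determinant of a square matrix, by exact elimination with row swaps."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n = len(rows)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # no pivot in this column: singular matrix
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            result = -result            # a row swap flips the sign
        result *= rows[col][col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return result

# The matrix of Example 5; row 2 is twice row 1.
A = [
    [1, 3, -2, 4],
    [2, 6, -4, 8],
    [3, 9,  1, 5],
    [1, 1,  4, 3],
]

determinant = det(A)
print(determinant == 0)
```

A zero determinant means A is not invertible, so by Example 1 the operator $T_A$ is not one-to-one.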



Inverse Linear Transformations 

In Section 4.3 we defined the inverse of a one-to-one matrix operator $T_A: R^n \to R^n$ to be $T_{A^{-1}}: R^n \to R^n$, and we
showed that if $\mathbf{w}$ is the image of a vector $\mathbf{x}$ under $T_A$, then $T_A^{-1} = T_{A^{-1}}$ maps $\mathbf{w}$ back into $\mathbf{x}$. We shall now extend these ideas
to general linear transformations.

Recall that if $T: V \to W$ is a linear transformation, then the range of T, denoted by R(T), is the subspace of W consisting of
all images under T of vectors in V. If T is one-to-one, then each vector $\mathbf{w}$ in R(T) is the image of a unique vector $\mathbf{v}$ in V. This
uniqueness allows us to define a new function, called the inverse of T and denoted by $T^{-1}$, that maps $\mathbf{w}$ back into $\mathbf{v}$ (Figure
8.3.1).

Figure 8.3.1

The inverse of T maps $T(\mathbf{v})$ back into $\mathbf{v}$.



It can be proved (Exercise 19) that $T^{-1}: R(T) \to V$ is a linear transformation. Moreover, it follows from the definition of
$T^{-1}$ that

$$T^{-1}(T(\mathbf{v})) = T^{-1}(\mathbf{w}) = \mathbf{v} \tag{2a}$$

$$T(T^{-1}(\mathbf{w})) = T(\mathbf{v}) = \mathbf{w} \tag{2b}$$

so that T and $T^{-1}$, when applied in succession in either order, cancel the effect of one another.

Remark It is important to note that if $T: V \to W$ is a one-to-one linear transformation, then the domain of $T^{-1}$ is the
range of T. The range may or may not be all of W. However, in the special case where $T: V \to V$ is a one-to-one linear
operator, it follows from Theorem 8.3.2 that R(T) = V; that is, the domain of $T^{-1}$ is all of V.



EXAMPLE 6 An Inverse Transformation 



In Example 2 we showed that the linear transformation $T: P_n \to P_{n+1}$ given by

$$T(\mathbf{p}) = T(p(x)) = x\,p(x)$$

is one-to-one; thus, T has an inverse. Here the range of T is not all of $P_{n+1}$; rather, R(T) is the subspace of $P_{n+1}$ consisting
of polynomials with a zero constant term. This is evident from the formula for T:

$$T(c_0 + c_1 x + \cdots + c_n x^n) = c_0 x + c_1 x^2 + \cdots + c_n x^{n+1}$$

It follows that $T^{-1}: R(T) \to P_n$ is given by the formula

$$T^{-1}(c_0 x + c_1 x^2 + \cdots + c_n x^{n+1}) = c_0 + c_1 x + \cdots + c_n x^n$$

For example, in the case where $n \geq 3$,

$$T^{-1}(2x - x^2 + 5x^3 + 3x^4) = 2 - x + 5x^2 + 3x^3$$
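The following sketch mirrors Example 6 in code, under the assumption that a polynomial $c_0 + c_1 x + \cdots + c_n x^n$ is stored as the coefficient list `[c0, c1, ..., cn]`. The function names `T` and `T_inverse` are illustrative. Note that `T_inverse` rejects inputs outside R(T), which reflects the remark that the domain of $T^{-1}$ is the range of T.

```python
def T(coeffs):
    """Multiply p(x) by x: [c0, c1, ..., cn] becomes [0, c0, c1, ..., cn]."""
    return [0] + list(coeffs)

def T_inverse(coeffs):
    """Undo T; defined only on R(T), the polynomials with zero constant term."""
    if coeffs[0] != 0:
        raise ValueError("polynomial is not in the range of T")
    return list(coeffs[1:])

q = [0, 2, -1, 5, 3]     # 2x - x^2 + 5x^3 + 3x^4
p = T_inverse(q)         # coefficients of 2 - x + 5x^2 + 3x^3
print(p, T(p) == q)
```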



EXAMPLE 7 An Inverse Transformation 



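Let $T: R^3 \to R^3$ be the linear operator defined by the formula

$$T(x_1, x_2, x_3) = (3x_1 + x_2,\; -2x_1 - 4x_2 + 3x_3,\; 5x_1 + 4x_2 - 2x_3)$$

Determine whether T is one-to-one; if so, find $T^{-1}(x_1, x_2, x_3)$.

Solution

From Theorem 4.3.3, the standard matrix for T is

$$[T] = \begin{bmatrix} 3 & 1 & 0 \\ -2 & -4 & 3 \\ 5 & 4 & -2 \end{bmatrix}$$

(verify). This matrix is invertible, and from Formula 1 of Section 4.3, the standard matrix for $T^{-1}$ is

$$[T^{-1}] = [T]^{-1} = \begin{bmatrix} 4 & -2 & -3 \\ -11 & 6 & 9 \\ -12 & 7 & 10 \end{bmatrix}$$

It follows that

$$T^{-1}\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) = [T^{-1}]\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 4 & -2 & -3 \\ -11 & 6 & 9 \\ -12 & 7 & 10 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 4x_1 - 2x_2 - 3x_3 \\ -11x_1 + 6x_2 + 9x_3 \\ -12x_1 + 7x_2 + 10x_3 \end{bmatrix}$$

Expressing this result in horizontal notation yields

$$T^{-1}(x_1, x_2, x_3) = (4x_1 - 2x_2 - 3x_3,\; -11x_1 + 6x_2 + 9x_3,\; -12x_1 + 7x_2 + 10x_3)$$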
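The "(verify)" steps of Example 7 can be carried out numerically. This is a sketch only: the helper names `mat_mul` and `apply` are ours, and the sample vector is arbitrary. It confirms that the two standard matrices multiply to the identity in both orders and that $T^{-1}$ sends $T(\mathbf{x})$ back to $\mathbf{x}$, as Formulas (2a) and (2b) require.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

T_std = [[3, 1, 0], [-2, -4, 3], [5, 4, -2]]            # standard matrix of T
T_inv_std = [[4, -2, -3], [-11, 6, 9], [-12, 7, 10]]    # claimed standard matrix of T^{-1}
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

x = [1, 2, 3]                 # arbitrary sample vector
w = apply(T_std, x)           # w = T(x)
x_back = apply(T_inv_std, w)  # T^{-1}(w) should recover x
print(w, x_back)
```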



The following theorem shows that a composition of one-to-one linear transformations is one-to-one, and it relates the inverse 
of the composition to the inverses of the individual linear transformations. 



THEOREM 8.3.3 



If $T_1: U \to V$ and $T_2: V \to W$ are one-to-one linear transformations, then

(a) $T_2 \circ T_1$ is one-to-one.

(b) $(T_2 \circ T_1)^{-1} = T_1^{-1} \circ T_2^{-1}$.



Proof (a) We want to show that $T_2 \circ T_1$ maps distinct vectors in U into distinct vectors in W. But if $\mathbf{u}$ and $\mathbf{v}$ are distinct
vectors in U, then $T_1(\mathbf{u})$ and $T_1(\mathbf{v})$ are distinct vectors in V since $T_1$ is one-to-one. This and the fact that $T_2$ is one-to-one
imply that

$$T_2(T_1(\mathbf{u})) \quad \text{and} \quad T_2(T_1(\mathbf{v}))$$

are also distinct vectors. But these expressions can also be written as

$$(T_2 \circ T_1)(\mathbf{u}) \quad \text{and} \quad (T_2 \circ T_1)(\mathbf{v})$$

so $T_2 \circ T_1$ maps $\mathbf{u}$ and $\mathbf{v}$ into distinct vectors in W.



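Proof (b) We want to show that

$$(T_2 \circ T_1)^{-1}(\mathbf{w}) = (T_1^{-1} \circ T_2^{-1})(\mathbf{w})$$

for every vector $\mathbf{w}$ in the range of $T_2 \circ T_1$. For this purpose, let

$$\mathbf{u} = (T_2 \circ T_1)^{-1}(\mathbf{w}) \tag{3}$$

so our goal is to show that

$$\mathbf{u} = (T_1^{-1} \circ T_2^{-1})(\mathbf{w})$$

But it follows from (3) that

$$(T_2 \circ T_1)(\mathbf{u}) = \mathbf{w}$$

or, equivalently,

$$T_2(T_1(\mathbf{u})) = \mathbf{w}$$

Now, taking $T_2^{-1}$ of each side of this equation and then $T_1^{-1}$ of each side of the result and using (2a)
yields (verify)

$$\mathbf{u} = T_1^{-1}(T_2^{-1}(\mathbf{w}))$$

or, equivalently,

$$\mathbf{u} = (T_1^{-1} \circ T_2^{-1})(\mathbf{w})$$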



In words, part (b) of Theorem 8.3.3 states that the inverse of a composition is the composition of the inverses in the reverse 
order. This result can be extended to compositions of three or more linear transformations; for example, 

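$$(T_3 \circ T_2 \circ T_1)^{-1} = T_1^{-1} \circ T_2^{-1} \circ T_3^{-1} \tag{4}$$

In the case where $T_A$, $T_B$, and $T_C$ are matrix operators on $R^n$, Formula (4) can be written as

$$(T_C \circ T_B \circ T_A)^{-1} = T_A^{-1} \circ T_B^{-1} \circ T_C^{-1}$$

which we might also write as

$$(T_{CBA})^{-1} = T_{A^{-1}B^{-1}C^{-1}} \tag{5}$$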

In words, this formula states that the standard matrix for the inverse of a composition is the product of the inverses of the 
standard matrices of the individual operators in the reverse order. 

Some problems that use Formula 5 are given in the exercises. 
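Formula 5 says that the standard matrix of the inverse of a composition is $A^{-1}B^{-1}C^{-1}$, the inverses multiplied in the reverse order. The sketch below checks this for three sample $2 \times 2$ matrices; the matrices and the helper names `mat_mul` and `inverse_2x2` are assumptions chosen for illustration, not taken from the text.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix by the adjoint formula, in exact arithmetic."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical invertible matrices standing in for A, B, and C.
A = [[1, 2], [3, 5]]
B = [[2, 1], [1, 1]]
C = [[1, 1], [0, 1]]

# Left side of Formula 5: standard matrix of (T_C o T_B o T_A)^{-1}.
lhs = inverse_2x2(mat_mul(C, mat_mul(B, A)))
# Right side: inverses multiplied in the reverse order.
rhs = mat_mul(inverse_2x2(A), mat_mul(inverse_2x2(B), inverse_2x2(C)))
print(lhs == rhs)
```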

Dimension of Domain and Codomain 

In Exercise 16 you are asked to show the important fact that if V and W are finite-dimensional vector spaces with
$\dim(W) < \dim(V)$, and if $T: V \to W$ is a linear transformation, then T cannot be one-to-one. In other words, the dimension
of the codomain W must be at least as large as the dimension of the domain V for there to be a one-to-one linear
transformation from V to W. This means, for example, that there can be no one-to-one linear transformation from space $R^3$ to
the plane $R^2$.



EXAMPLE 8 Dimension and One-to-One Linear Transformations 

A linear transformation $T_A$ from the plane $R^2$ to the real line R has a standard matrix

$$A = \begin{bmatrix} a & b \end{bmatrix}$$

If $\mathbf{u} = (x, y)$ is a point in $R^2$, its image is

$$T_A(\mathbf{u}) = ax + by$$

which is a scalar. But if $ax + by = k$, say, then there are infinitely many other points $\mathbf{v}$ in $R^2$ that also have $T_A(\mathbf{v}) = k$, since
there are infinitely many points on the line

$$ax + by = k$$

This is because if a and b are nonzero, then every point of the form

$$\mathbf{v} = \left(\frac{k - by}{a},\; y\right)$$

has $T_A(\mathbf{v}) = k$, whereas if $a = 0$ but b is nonzero, then every point of the form

$$\mathbf{v} = (x, 0)$$

has $T_A(\mathbf{v}) = 0$, and if $b = 0$ but a is nonzero, then every point of the form

$$\mathbf{v} = (0, y)$$

has $T_A(\mathbf{v}) = 0$. Finally, in the degenerate case $a = 0$ and $b = 0$, we have $T_A(\mathbf{v}) = 0$ for every $\mathbf{v}$ in $R^2$.

In each case, $T_A$ fails to be one-to-one, so there can be no transformation from the plane to the real line that is both linear and
one-to-one.

Of course, even if $\dim(W) \geq \dim(V)$, a linear transformation from V to W might not be one-to-one, as the zero
transformation shows.



Exercise Set 8.3 






1.

In each part, find ker(T), and determine whether the linear transformation T is one-to-one.

(a) $T: R^2 \to R^2$, where $T(x, y) = (y, x)$

(b) $T: R^2 \to R^2$, where $T(x, y) = (0, 2x + 3y)$

(c) $T: R^2 \to R^2$, where $T(x, y) = (x + y, x - y)$

(d) $T: R^2 \to R^3$, where $T(x, y) = (x, y, x + y)$

(e) $T: R^2 \to R^3$, where $T(x, y) = (x - y, y - x, 2x - 2y)$

(f) $T: R^3 \to R^2$, where $T(x, y, z) = (x + y + z, x - y - z)$



2. 



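In each part, let $T: R^2 \to R^2$ be multiplication by A. Determine whether T has an inverse; if so, find

$$T^{-1}\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right)$$

(a) $A = \begin{bmatrix} 5 & 2 \\ 2 & 1 \end{bmatrix}$

(b) $A = \begin{bmatrix} 6 & -3 \\ 4 & -2 \end{bmatrix}$

(c) $A = \begin{bmatrix} 4 & 7 \\ -1 & 3 \end{bmatrix}$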



3. 



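In each part, let $T: R^3 \to R^3$ be multiplication by A. Determine whether T has an inverse; if so, find

$$T^{-1}\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right)$$

(a) $A = \begin{bmatrix} 1 & 5 & 2 \\ 1 & 2 & 1 \\ -1 & 1 & 0 \end{bmatrix}$

(b) $A = \begin{bmatrix} 1 & 4 & -1 \\ 1 & 2 & 1 \\ -1 & 1 & 0 \end{bmatrix}$

(c) $A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}$

(d) $A = \begin{bmatrix} 1 & -1 & 1 \\ 0 & 2 & -1 \\ 2 & 3 & 0 \end{bmatrix}$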

4. 



In each part, determine whether multiplication by A is a one-to-one linear transformation. 



(a) $A = \begin{bmatrix} 1 & -2 \\ 2 & -4 \\ -3 & 6 \end{bmatrix}$


(b) 



A = 



1 3 5 7 

2 -12 4 
-13 



(c) $A = \begin{bmatrix} 4 & -2 \\ 1 & 5 \\ 5 & 3 \end{bmatrix}$



5.

As indicated in the accompanying figure, let $T: R^2 \to R^2$ be the orthogonal projection on the line $y = x$.



(a) Find the kernel of T. 



(b) Is T one-to-one? Justify your conclusion. 




Figure Ex-5 



6. 



As indicated in the accompanying figure, let $T: R^2 \to R^2$ be the linear operator that reflects each point about the y-axis.



(a) Find the kernel of T. 



(b) Is T one-to-one? Justify your conclusion. 






Figure Ex-6 



7. 



In each part, use the given information to determine whether the linear transformation T is one-to-one. 



(a) $T: R^n \to R^m$; nullity(T) = 0

(b) $T: R^n \to R^n$; rank(T) = $n - 1$

(c) $T: R^m \to R^n$; $n < m$

(d) $T: R^n \to R^n$; $R(T) = R^n$



8. 



In each part, determine whether the linear transformation T is one-to-one. 



(a) $T: P_2 \to P_3$, where $T(a_0 + a_1 x + a_2 x^2) = x(a_0 + a_1 x + a_2 x^2)$

(b) $T: P_2 \to P_2$, where $T(p(x)) = p(x + 1)$

9.

Let A be a square matrix such that det(A) = 0. Is multiplication by A a one-to-one linear transformation? Justify your
conclusion.

10.

In each part, determine whether the linear operator $T: R^n \to R^n$ is one-to-one; if so, find $T^{-1}(x_1, x_2, \ldots, x_n)$.

(a) $T(x_1, x_2, \ldots, x_n) = (0, x_1, x_2, \ldots, x_{n-1})$

(b) $T(x_1, x_2, \ldots, x_n) = (x_n, x_{n-1}, \ldots, x_2, x_1)$

(c) $T(x_1, x_2, \ldots, x_n) = (x_2, x_3, \ldots, x_n, x_1)$

11.

Let $T: R^n \to R^n$ be the linear operator defined by the formula

$$T(x_1, x_2, \ldots, x_n) = (a_1 x_1, a_2 x_2, \ldots, a_n x_n)$$

where $a_1, \ldots, a_n$ are constants.

(a) Under what conditions will T have an inverse?

(b) Assuming that the conditions determined in part (a) are satisfied, find a formula for $T^{-1}(x_1, x_2, \ldots, x_n)$.



12.

Let $T_1: R^2 \to R^2$ and $T_2: R^2 \to R^2$ be the linear operators given by the formulas

$$T_1(x, y) = (x + y, x - y) \quad \text{and} \quad T_2(x, y) = (2x + y, x - 2y)$$

(a) Show that $T_1$ and $T_2$ are one-to-one.

(b) Find formulas for $T_1^{-1}(x, y)$, $T_2^{-1}(x, y)$, and $(T_2 \circ T_1)^{-1}(x, y)$.

(c) Verify that $(T_2 \circ T_1)^{-1} = T_1^{-1} \circ T_2^{-1}$.

13.

Let $T_1: P_2 \to P_3$ and $T_2: P_3 \to P_3$ be the linear transformations given by the formulas

$$T_1(p(x)) = x\,p(x) \quad \text{and} \quad T_2(p(x)) = p(x + 1)$$

(a) Find formulas for $T_1^{-1}(p(x))$, $T_2^{-1}(p(x))$, and $(T_2 \circ T_1)^{-1}(p(x))$.

(b) Verify that $(T_2 \circ T_1)^{-1} = T_1^{-1} \circ T_2^{-1}$.

14.

Let $T_A: R^3 \to R^3$, $T_B: R^3 \to R^3$, and $T_C: R^3 \to R^3$ be the reflections about the xy-plane, the xz-plane, and the yz-plane, respectively. Verify Formula 5 for these linear operators.



15. 



Let $T: P_1 \to R^2$ be the function defined by the formula

$$T(p(x)) = (p(0), p(1))$$

(a) Find $T(1 - 2x)$.

(b) Show that T is a linear transformation.

(c) Show that T is one-to-one.

(d) Find $T^{-1}(2, 3)$, and sketch its graph.

16.

Prove: If V and W are finite-dimensional vector spaces such that $\dim W < \dim V$, then there is no one-to-one linear
transformation $T: V \to W$.



17. 



In each part, determine whether the linear operator $T: M_{22} \to M_{22}$ is one-to-one. If so, find

$$T^{-1}\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right)$$



(a) 



a b 
c d 



a 
d 



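(b) $T\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = \begin{bmatrix} a & c \\ b & d \end{bmatrix}$

(c) $T\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$

18.

Let $T: R^2 \to R^2$ be the linear operator given by the formula $T(x, y) = (x + ky, -y)$. Show that T is one-to-one for
every real value of k and that $T^{-1} = T$.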



19. 



Prove that if $T: V \to W$ is a one-to-one linear transformation, then $T^{-1}: R(T) \to V$ is a linear transformation.



20. 



(For Readers Who Have Studied Calculus) Let $J: P_1 \to R$ be the integration transformation $J(\mathbf{p}) = \int_{-1}^{1} p(x)\,dx$.
Determine whether J is one-to-one. Justify your conclusion.

21. (For Readers Who Have Studied Calculus) Let V be the vector space $C^1[0, 1]$ and let $T: V \to R$ be defined by
$T(\mathbf{f}) = f(0) + 2f'(0) + 3f'(1)$. Verify that T is a linear transformation. Determine whether T is one-to-one. Justify
your conclusion.



In Exercises 22 and 23, determine whether $T_1 \circ T_2 = T_2 \circ T_1$.

22.

(a) $T_1: R^2 \to R^2$ is the orthogonal projection on the x-axis, and $T_2: R^2 \to R^2$ is the orthogonal projection on the y-axis.

(b) $T_1: R^2 \to R^2$ is the rotation about the origin through an angle $\theta_1$, and $T_2: R^2 \to R^2$ is the rotation about the origin through an angle $\theta_2$.

(c) $T_1: R^3 \to R^3$ is the rotation about the x-axis through an angle $\theta_1$, and $T_2: R^3 \to R^3$ is the rotation about the z-axis through an angle $\theta_2$.

23.

(a) $T_1: R^2 \to R^2$ is the reflection about the x-axis, and $T_2: R^2 \to R^2$ is the reflection about the y-axis.

(b) $T_1: R^2 \to R^2$ is the orthogonal projection on the x-axis, and $T_2: R^2 \to R^2$ is the counterclockwise rotation through an angle $\theta$.

(c) $T_1: R^3 \to R^3$ is a dilation by a factor k, and $T_2: R^3 \to R^3$ is the counterclockwise rotation about the z-axis through an angle $\theta$.



Discussion
Discovery



24. 



Indicate whether each statement is always true or sometimes false. Justify your answer by 
giving a logical argument or a counterexample. 

(a) If $T: R^2 \to R^2$ is the orthogonal projection onto the x-axis, then $T^{-1}: R^2 \to R^2$ maps each point on the x-axis onto a line that is perpendicular to the x-axis.

(b) If $T_1: U \to V$ and $T_2: V \to W$ are linear transformations, and if $T_1$ is not one-to-one, then neither is $T_2 \circ T_1$.

(c) In the xy-plane, a rotation about the origin followed by a reflection about a coordinate axis is one-to-one.

25.

Does the formula $T(a, b, c) = ax^2 + bx + c$ define a one-to-one linear transformation from $R^3$ to $P_2$? Explain your reasoning.

26.

Let E be a fixed $2 \times 2$ elementary matrix. Does the formula $T(A) = EA$ define a one-to-one linear operator on $M_{22}$? Explain your reasoning.

27.

Let $\mathbf{a}$ be a fixed vector in $R^3$. Does the formula $T(\mathbf{v}) = \mathbf{a} \times \mathbf{v}$ define a one-to-one linear operator on $R^3$? Explain your reasoning.

28. (For Readers Who Have Studied Calculus) The Fundamental Theorem of Calculus implies
that integration and differentiation reverse the actions of each other in some sense. Define a
transformation $D: P_n \to P_{n-1}$ by $D(p(x)) = p'(x)$, and define $J: P_{n-1} \to P_n$ by

$$J(p(x)) = \int_0^x p(t)\,dt$$

(a) Show that D and J are linear transformations.

(b) Explain why J is not the inverse transformation of D.

(c) Can we restrict the domains and/or codomains of D and J such that they are inverse
linear transformations of each other?
8.4

MATRICES OF GENERAL 

LINEAR 

TRANSFORMATIONS 



In this section we shall show that if V and W are finite-dimensional vector
spaces (not necessarily $R^n$ and $R^m$), then with a little ingenuity any linear
transformation $T: V \to W$ can be regarded as a matrix transformation. The
basic idea is to work with coordinate vectors rather than with the vectors
themselves.



Matrices of Linear Transformations

Suppose that V is an n-dimensional vector space and W an m-dimensional vector space. If we choose bases B and $B'$ for V and
W, respectively, then for each $\mathbf{x}$ in V, the coordinate vector $[\mathbf{x}]_B$ will be a vector in $R^n$, and the coordinate vector $[T(\mathbf{x})]_{B'}$ will
be a vector in $R^m$ (Figure 8.4.1).



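Figure 8.4.1 (T sends a vector $\mathbf{x}$ in the n-dimensional space V to a vector $T(\mathbf{x})$ in the m-dimensional space W; their coordinate vectors $[\mathbf{x}]_B$ and $[T(\mathbf{x})]_{B'}$ are vectors in $R^n$ and $R^m$.)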

Suppose $T: V \to W$ is a linear transformation. If, as illustrated in Figure 8.4.2, we complete the rectangle suggested by Figure
8.4.1, we obtain a mapping from $R^n$ to $R^m$, which can be shown to be a linear transformation. (This is the kind of
transformation studied in Section 4.3, where we considered linear transformations from $R^n$ to $R^m$.) If we let A be the standard matrix for this transformation,
then

$$A[\mathbf{x}]_B = [T(\mathbf{x})]_{B'} \tag{1}$$

The matrix A in (1) is called the matrix for T with respect to the bases B and $B'$.

Figure 8.4.2 (T maps V into W; multiplication by A maps $[\mathbf{x}]_B$ to $[T(\mathbf{x})]_{B'}$.)



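Later in this section, we shall give some of the uses of the matrix A in (1), but first, let us show how it can be computed. For this
purpose, let $B = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ be a basis for the n-dimensional space V and $B' = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m\}$ a basis for the
m-dimensional space W. We are looking for an $m \times n$ matrix

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$

such that (1) holds for all vectors $\mathbf{x}$ in V, meaning that A times the coordinate vector of $\mathbf{x}$ equals the coordinate vector of the image
$T(\mathbf{x})$ of $\mathbf{x}$. In particular, we want this equation to hold for the basis vectors $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$; that is,

$$A[\mathbf{u}_1]_B = [T(\mathbf{u}_1)]_{B'}, \quad A[\mathbf{u}_2]_B = [T(\mathbf{u}_2)]_{B'}, \quad \ldots, \quad A[\mathbf{u}_n]_B = [T(\mathbf{u}_n)]_{B'} \tag{2}$$

But

$$[\mathbf{u}_1]_B = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad [\mathbf{u}_2]_B = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad [\mathbf{u}_n]_B = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}$$

so

$$A[\mathbf{u}_1]_B = \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}, \quad A[\mathbf{u}_2]_B = \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}, \quad \ldots, \quad A[\mathbf{u}_n]_B = \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}$$

Substituting these results into (2) yields

$$\begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} = [T(\mathbf{u}_1)]_{B'}, \quad \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} = [T(\mathbf{u}_2)]_{B'}, \quad \ldots, \quad \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} = [T(\mathbf{u}_n)]_{B'}$$

which shows that the successive columns of A are the coordinate vectors of

$$T(\mathbf{u}_1), \quad T(\mathbf{u}_2), \quad \ldots, \quad T(\mathbf{u}_n)$$

with respect to the basis $B'$. Thus the matrix for T with respect to the bases B and $B'$ is

$$A = \big[\, [T(\mathbf{u}_1)]_{B'} \mid [T(\mathbf{u}_2)]_{B'} \mid \cdots \mid [T(\mathbf{u}_n)]_{B'} \,\big] \tag{3}$$

This matrix will be denoted by the symbol

$$[T]_{B',B}$$

so the preceding formula can also be written as

$$[T]_{B',B} = \big[\, [T(\mathbf{u}_1)]_{B'} \mid [T(\mathbf{u}_2)]_{B'} \mid \cdots \mid [T(\mathbf{u}_n)]_{B'} \,\big] \tag{4}$$

and from (1), this matrix has the property

$$[T]_{B',B}[\mathbf{x}]_B = [T(\mathbf{x})]_{B'} \tag{4a}$$

Remark Observe that in the notation $[T]_{B',B}$ the right subscript is a basis for the domain of T, and the left subscript is a basis
for the image space of T (Figure 8.4.3). Moreover, observe how the subscript B seems to "cancel out" in Formula (4a) (Figure
8.4.4).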
Matrices of Linear Operators



In the special case where V = W (so that $T: V \to V$ is a linear operator), it is usual to take $B = B'$ when constructing a matrix
for T. In this case the resulting matrix is called the matrix for T with respect to the basis B and is usually denoted by $[T]_B$
rather than $[T]_{B,B}$. If $B = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$, then Formulas (4) and (4a) become

$$[T]_B = \big[\, [T(\mathbf{u}_1)]_B \mid [T(\mathbf{u}_2)]_B \mid \cdots \mid [T(\mathbf{u}_n)]_B \,\big] \tag{5}$$

and

$$[T]_B[\mathbf{x}]_B = [T(\mathbf{x})]_B \tag{5a}$$

Phrased informally, (4a) and (5a) state that the matrix for T times the coordinate vector for $\mathbf{x}$ is the coordinate vector for $T(\mathbf{x})$.



EXAMPLE 1 Matrix for a Linear Transformation 



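Let $T: P_1 \to P_2$ be the linear transformation defined by

$$T(p(x)) = x\,p(x)$$

Find the matrix for T with respect to the standard bases

$$B = \{\mathbf{u}_1, \mathbf{u}_2\} \quad \text{and} \quad B' = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$$

where

$$\mathbf{u}_1 = 1, \quad \mathbf{u}_2 = x; \qquad \mathbf{v}_1 = 1, \quad \mathbf{v}_2 = x, \quad \mathbf{v}_3 = x^2$$

Solution

From the given formula for T we obtain

$$T(\mathbf{u}_1) = T(1) = (x)(1) = x$$
$$T(\mathbf{u}_2) = T(x) = (x)(x) = x^2$$

By inspection, we can determine the coordinate vectors for $T(\mathbf{u}_1)$ and $T(\mathbf{u}_2)$ relative to $B'$; they are

$$[T(\mathbf{u}_1)]_{B'} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \qquad [T(\mathbf{u}_2)]_{B'} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$

Thus the matrix for T with respect to B and $B'$ is

$$[T]_{B',B} = \big[\, [T(\mathbf{u}_1)]_{B'} \mid [T(\mathbf{u}_2)]_{B'} \,\big] = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}$$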

EXAMPLE 2 Verifying Formula (4a) 



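Let $T: P_1 \to P_2$ be the linear transformation in Example 1. Show that the matrix

$$[T]_{B',B} = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}$$

(obtained in Example 1) satisfies (4a) for every vector $\mathbf{x} = a + bx$ in $P_1$.

Solution

Since $\mathbf{x} = p(x) = a + bx$, we have

$$T(\mathbf{x}) = x\,p(x) = ax + bx^2$$

For the bases B and $B'$ in Example 1, it follows by inspection that

$$[\mathbf{x}]_B = [a + bx]_B = \begin{bmatrix} a \\ b \end{bmatrix} \quad \text{and} \quad [T(\mathbf{x})]_{B'} = [ax + bx^2]_{B'} = \begin{bmatrix} 0 \\ a \\ b \end{bmatrix}$$

Thus

$$[T]_{B',B}[\mathbf{x}]_B = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 0 \\ a \\ b \end{bmatrix} = [T(\mathbf{x})]_{B'}$$

so (4a) holds.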
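Examples 1 and 2 can be replayed in a few lines of code. The sketch below assumes polynomials are stored as coefficient lists, so that the coordinate vector of a polynomial relative to the standard basis is just its coefficient list. The names `T`, `mat_vec`, and `matrix` are ours, and the values of a and b are arbitrary.

```python
def T(coeffs):
    """T(p(x)) = x p(x) on coefficient lists [c0, c1, ...]."""
    return [0] + list(coeffs)

def mat_vec(M, v):
    """Matrix-vector product M v."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

# Standard basis of P1 as coefficient lists: u1 = 1, u2 = x.
basis_B = [[1, 0], [0, 1]]

# Columns of the matrix are the coordinate vectors of T(u1), T(u2)
# relative to the standard basis {1, x, x^2} of P2.
columns = [T(u) for u in basis_B]
matrix = [[col[i] for col in columns] for i in range(3)]

# Check the property (4a) for a sample polynomial a + bx.
a, b = 4, -7
lhs = mat_vec(matrix, [a, b])   # [T]_{B',B} [x]_B
rhs = T([a, b])                 # [T(x)]_{B'}
print(matrix, lhs == rhs)
```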



EXAMPLE 3 Matrix for a Linear Transformation 



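Let $T: R^2 \to R^3$ be the linear transformation defined by

$$T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} x_2 \\ -5x_1 + 13x_2 \\ -7x_1 + 16x_2 \end{bmatrix}$$

Find the matrix for the transformation T with respect to the bases $B = \{\mathbf{u}_1, \mathbf{u}_2\}$ for $R^2$ and $B' = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ for $R^3$, where

$$\mathbf{u}_1 = \begin{bmatrix} 3 \\ 1 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} 5 \\ 2 \end{bmatrix}; \qquad \mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} -1 \\ 2 \\ 2 \end{bmatrix}, \quad \mathbf{v}_3 = \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix}$$

Solution

From the formula for T,

$$T(\mathbf{u}_1) = \begin{bmatrix} 1 \\ -2 \\ -5 \end{bmatrix}, \qquad T(\mathbf{u}_2) = \begin{bmatrix} 2 \\ 1 \\ -3 \end{bmatrix}$$

Expressing these vectors as linear combinations of $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$, we obtain (verify)

$$T(\mathbf{u}_1) = \mathbf{v}_1 - 2\mathbf{v}_3, \qquad T(\mathbf{u}_2) = 3\mathbf{v}_1 + \mathbf{v}_2 - \mathbf{v}_3$$

Thus

$$[T(\mathbf{u}_1)]_{B'} = \begin{bmatrix} 1 \\ 0 \\ -2 \end{bmatrix}, \qquad [T(\mathbf{u}_2)]_{B'} = \begin{bmatrix} 3 \\ 1 \\ -1 \end{bmatrix}$$

so

$$[T]_{B',B} = \big[\, [T(\mathbf{u}_1)]_{B'} \mid [T(\mathbf{u}_2)]_{B'} \,\big] = \begin{bmatrix} 1 & 3 \\ 0 & 1 \\ -2 & -1 \end{bmatrix}$$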

EXAMPLE 4 Verifying Formula (5a) 



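Let $T: R^2 \to R^2$ be the linear operator defined by

$$T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} x_1 + x_2 \\ -2x_1 + 4x_2 \end{bmatrix}$$

and let $B = \{\mathbf{u}_1, \mathbf{u}_2\}$ be the basis, where

$$\mathbf{u}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad \mathbf{u}_2 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$

(a) Find $[T]_B$.

(b) Verify that (5a) holds for every vector $\mathbf{x}$ in $R^2$.

Solution (a)

From the given formula for T,

$$T(\mathbf{u}_1) = \begin{bmatrix} 2 \\ 2 \end{bmatrix} = 2\mathbf{u}_1, \qquad T(\mathbf{u}_2) = \begin{bmatrix} 3 \\ 6 \end{bmatrix} = 3\mathbf{u}_2$$

Therefore,

$$[T(\mathbf{u}_1)]_B = \begin{bmatrix} 2 \\ 0 \end{bmatrix} \quad \text{and} \quad [T(\mathbf{u}_2)]_B = \begin{bmatrix} 0 \\ 3 \end{bmatrix}$$

Consequently,

$$[T]_B = \big[\, [T(\mathbf{u}_1)]_B \mid [T(\mathbf{u}_2)]_B \,\big] = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}$$

Solution (b)

If

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \tag{6}$$

is any vector in $R^2$, then from the given formula for T,

$$T(\mathbf{x}) = \begin{bmatrix} x_1 + x_2 \\ -2x_1 + 4x_2 \end{bmatrix} \tag{7}$$

To find $[\mathbf{x}]_B$ and $[T(\mathbf{x})]_B$, we must express (6) and (7) as linear combinations of $\mathbf{u}_1$ and $\mathbf{u}_2$. This yields the vector equations

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = k_1\begin{bmatrix} 1 \\ 1 \end{bmatrix} + k_2\begin{bmatrix} 1 \\ 2 \end{bmatrix} \tag{8}$$

$$\begin{bmatrix} x_1 + x_2 \\ -2x_1 + 4x_2 \end{bmatrix} = c_1\begin{bmatrix} 1 \\ 1 \end{bmatrix} + c_2\begin{bmatrix} 1 \\ 2 \end{bmatrix} \tag{9}$$

Equating corresponding entries yields the linear systems

$$\begin{aligned} k_1 + k_2 &= x_1 \\ k_1 + 2k_2 &= x_2 \end{aligned} \tag{10}$$

and

$$\begin{aligned} c_1 + c_2 &= x_1 + x_2 \\ c_1 + 2c_2 &= -2x_1 + 4x_2 \end{aligned} \tag{11}$$

Solving (10) for $k_1$ and $k_2$ yields

$$k_1 = 2x_1 - x_2, \qquad k_2 = -x_1 + x_2$$

so

$$[\mathbf{x}]_B = \begin{bmatrix} 2x_1 - x_2 \\ -x_1 + x_2 \end{bmatrix}$$

and solving (11) for $c_1$ and $c_2$ yields

$$c_1 = 4x_1 - 2x_2, \qquad c_2 = -3x_1 + 3x_2$$

so

$$[T(\mathbf{x})]_B = \begin{bmatrix} 4x_1 - 2x_2 \\ -3x_1 + 3x_2 \end{bmatrix}$$

Thus

$$[T]_B[\mathbf{x}]_B = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}\begin{bmatrix} 2x_1 - x_2 \\ -x_1 + x_2 \end{bmatrix} = \begin{bmatrix} 4x_1 - 2x_2 \\ -3x_1 + 3x_2 \end{bmatrix} = [T(\mathbf{x})]_B$$

so (5a) holds.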
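A numerical spot check of Formula (5a) for this example is sketched below. It assumes the coordinate formulas $k_1 = 2x_1 - x_2$, $k_2 = -x_1 + x_2$ relative to the basis $\mathbf{u}_1 = (1, 1)$, $\mathbf{u}_2 = (1, 2)$; the function names and the sample vectors are ours.

```python
def T(x):
    """The operator of Example 4 in standard coordinates."""
    x1, x2 = x
    return [x1 + x2, -2 * x1 + 4 * x2]

def coords_B(x):
    """Coordinates of x relative to B = {u1, u2}, u1 = (1, 1), u2 = (1, 2)."""
    x1, x2 = x
    return [2 * x1 - x2, -x1 + x2]

def mat_vec(M, v):
    """Matrix-vector product M v."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

T_B = [[2, 0], [0, 3]]   # matrix for T with respect to B

# Formula (5a): [T]_B [x]_B should equal [T(x)]_B for every x.
samples = [[1, 0], [0, 1], [3, -4], [-2, 7]]
checks = [mat_vec(T_B, coords_B(x)) == coords_B(T(x)) for x in samples]
print(checks)
```

A finite set of samples is not a proof, but because both sides are linear in $\mathbf{x}$, agreement on the two standard basis vectors already forces agreement everywhere.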



Matrices of Identity Operators 

The matrix for the identity operator on V always takes a special form. 



EXAMPLE 5 Matrices of Identity Operators 



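If $B = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ is a basis for a finite-dimensional vector space V and $I: V \to V$ is the identity operator on V, then

$$I(\mathbf{u}_1) = \mathbf{u}_1, \quad I(\mathbf{u}_2) = \mathbf{u}_2, \quad \ldots, \quad I(\mathbf{u}_n) = \mathbf{u}_n$$

Therefore,

$$[I(\mathbf{u}_1)]_B = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad [I(\mathbf{u}_2)]_B = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad [I(\mathbf{u}_n)]_B = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}$$

Thus

$$[I]_B = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = I$$

Consequently, the matrix of the identity operator with respect to any basis is the $n \times n$ identity matrix. This result could have
been anticipated from Formula (5a), since the formula yields

$$[I]_B[\mathbf{x}]_B = [I(\mathbf{x})]_B = [\mathbf{x}]_B$$

which is consistent with the fact that $[I]_B = I$.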

We leave it as an exercise to prove the following result. 

THEOREM 8.4.1 



If $T: R^n \to R^m$ is a linear transformation, and if B and $B'$ are the standard bases for $R^n$ and $R^m$, respectively, then

$$[T]_{B',B} = [T] \tag{12}$$

This theorem tells us that in the special case where T maps $R^n$ into $R^m$, the matrix for T with respect to the standard bases is the
standard matrix for T. In this special case, Formula (4a) of this section reduces to

$$[T]\mathbf{x} = T(\mathbf{x})$$

Why Matrices of Linear Transformations Are Important 

There are two primary reasons for studying matrices for general linear transformations, one theoretical and the other quite 
practical: 

* Answers to theoretical questions about the structure of general linear transformations on finite-dimensional vector spaces
can often be obtained by studying just the matrix transformations. Such matters are considered in detail in more advanced
linear algebra courses, but we will touch on them in later sections.

* These matrices make it possible to compute images of vectors using matrix multiplication. Such computations can be
performed rapidly on computers.



To focus on the latter idea, let $T: V \to W$ be a linear transformation. As shown in Figure 8.4.5, the matrix $[T]_{B',B}$ can be used
to calculate $T(\mathbf{x})$ in three steps by the following indirect procedure:

1. Compute the coordinate vector $[\mathbf{x}]_B$.

2. Multiply $[\mathbf{x}]_B$ on the left by $[T]_{B',B}$ to produce $[T(\mathbf{x})]_{B'}$.

3. Reconstruct $T(\mathbf{x})$ from its coordinate vector $[T(\mathbf{x})]_{B'}$.

Figure 8.4.5 (direct computation of $T(\mathbf{x})$ from $\mathbf{x}$, compared with the indirect route through steps 1 to 3)



EXAMPLE 6 Linear Operator on p 2 



Let X: P2 * ^2 ^ e ^ e ^ near operator defined by 

T{p{x))=p{3x-5) 

that is, T(c I ci* I c 2 x 2 ) =c 4- ci(3x - 5) I c 2 (3x-5) 2 - 

( a ) Find [7] ^ with respect to the basis B= l\,x,x t 

(b) Use the indirect procedure to compute 7(1 -f 2x + 3x ). 

(c) Check the result in (b) by computing 7(1 + 2x + 3x 2 ) directly. 



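Solution (a)

From the formula for $T$,

\[ T(1) = 1, \quad T(x) = 3x - 5, \quad T(x^2) = (3x - 5)^2 = 9x^2 - 30x + 25 \]

so

\[ [T(1)]_B = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad [T(x)]_B = \begin{bmatrix} -5 \\ 3 \\ 0 \end{bmatrix}, \quad [T(x^2)]_B = \begin{bmatrix} 25 \\ -30 \\ 9 \end{bmatrix} \]

Thus

\[ [T]_B = \begin{bmatrix} 1 & -5 & 25 \\ 0 & 3 & -30 \\ 0 & 0 & 9 \end{bmatrix} \]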



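Solution (b)

The coordinate vector relative to $B$ for the vector $\mathbf{p} = 1 + 2x + 3x^2$ is

\[ [\mathbf{p}]_B = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \]

Thus, from 5a,

\[ [T(1 + 2x + 3x^2)]_B = [T(\mathbf{p})]_B = [T]_B[\mathbf{p}]_B = \begin{bmatrix} 1 & -5 & 25 \\ 0 & 3 & -30 \\ 0 & 0 & 9 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 66 \\ -84 \\ 27 \end{bmatrix} \]

from which it follows that

\[ T(1 + 2x + 3x^2) = 66 - 84x + 27x^2 \]

Solution (c)

By direct computation,

\[ T(1 + 2x + 3x^2) = 1 + 2(3x - 5) + 3(3x - 5)^2 = 1 + 6x - 10 + 27x^2 - 90x + 75 = 66 - 84x + 27x^2 \]

which agrees with the result in (b).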
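The three-step indirect procedure of this example can also be checked numerically. The following is only a sketch in Python, under the assumption that a polynomial $c_0 + c_1 x + c_2 x^2$ is stored as its coordinate list `[c0, c1, c2]` relative to $B = \{1, x, x^2\}$; the helper names `substitute_affine`, `matrix_of_operator`, and `mat_vec` are illustrative and do not come from the text.

```python
def substitute_affine(coeffs, a, b):
    """Return the coefficient list of p(a*x + b), where p = coeffs[0] + coeffs[1]*x + ..."""
    result = [0] * len(coeffs)
    power = [1]  # coefficients of (a*x + b)**k, starting with k = 0
    for c in coeffs:
        for i, term in enumerate(power):
            result[i] += c * term
        # Multiply the running power by (a*x + b) for the next coefficient.
        nxt = [0] * (len(power) + 1)
        for i, term in enumerate(power):
            nxt[i] += b * term
            nxt[i + 1] += a * term
        power = nxt
    return result


def matrix_of_operator(n, a, b):
    """Matrix of T(p(x)) = p(a*x + b) relative to {1, x, ..., x**(n-1)}.

    Column k holds the coordinates of T(x**k).
    """
    cols = [substitute_affine([int(i == k) for i in range(n)], a, b) for k in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def mat_vec(M, v):
    """Multiply the matrix M (list of rows) by the column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


T_B = matrix_of_operator(3, 3, -5)       # matrix of T(p(x)) = p(3x - 5)
p_B = [1, 2, 3]                          # step 1: coordinates of 1 + 2x + 3x^2
indirect = mat_vec(T_B, p_B)             # step 2: multiply by [T]_B
direct = substitute_affine(p_B, 3, -5)   # direct computation for comparison
print(T_B, indirect, direct)
```

Step 3 (reconstruction) is implicit here, because the coordinate list relative to the standard basis is read off directly as the coefficients of the polynomial.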



Matrices of Compositions and Inverse Transformations 

We shall now mention two theorems that are generalizations of Formula 21 of Section 4.2 and Formula 1 of Section 4.3. The 
proofs are omitted. 



THEOREM 8.4.2 





If $T_1: U \to V$ and $T_2: V \to W$ are linear transformations, and if $B$, $B''$, and $B'$ are bases for $U$, $V$, and $W$, respectively,
then

\[ [T_2 \circ T_1]_{B',B} = [T_2]_{B',B''}\,[T_1]_{B'',B} \tag{13} \]



THEOREM 8.4.3 



If $T: V \to V$ is a linear operator, and if $B$ is a basis for $V$, then the following are equivalent.

(a) $T$ is one-to-one.

(b) $[T]_B$ is invertible.

Moreover, when these equivalent conditions hold,

\[ [T^{-1}]_B = [T]_B^{-1} \tag{14} \]



Remark In 13, observe how the interior subscript $B''$ (the basis for the intermediate space $V$) seems to "cancel out," leaving
only the bases for the domain and image space of the composition as subscripts (Figure 8.4.6). This cancellation of interior
subscripts suggests the following extension of Formula 13 to compositions of three linear transformations (Figure 8.4.7).

\[ [T_3 \circ T_2 \circ T_1]_{B',B} = [T_3]_{B',B'''}\,[T_2]_{B''',B''}\,[T_1]_{B'',B} \tag{15} \]

Figure 8.4.6 (The spaces $U$, $V$, and $W$ with bases $B$, $B''$, and $B'$, respectively.)



The following example illustrates Theorem 8.4.2. 



EXAMPLE 7 Using Theorem 8.4.2 



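Let $T_1: P_1 \to P_2$ be the linear transformation defined by

\[ T_1(p(x)) = x\,p(x) \]

and let $T_2: P_2 \to P_2$ be the linear operator defined by

\[ T_2(p(x)) = p(3x - 5) \]

Then the composition $(T_2 \circ T_1): P_1 \to P_2$ is given by

\[ (T_2 \circ T_1)(p(x)) = T_2(T_1(p(x))) = T_2(x\,p(x)) = (3x - 5)\,p(3x - 5) \]

Thus, if $p(x) = c_0 + c_1 x$, then

\[ (T_2 \circ T_1)(c_0 + c_1 x) = (3x - 5)(c_0 + c_1(3x - 5)) = c_0(3x - 5) + c_1(3x - 5)^2 \tag{16} \]

In this example, $P_1$ plays the role of $U$ in Theorem 8.4.2, and $P_2$ plays the roles of both $V$ and $W$; thus we can take $B' = B''$ in
13 so that the formula simplifies to

\[ [T_2 \circ T_1]_{B',B} = [T_2]_{B'}\,[T_1]_{B',B} \tag{17} \]

Let us choose $B = \{1, x\}$ to be the basis for $P_1$ and choose $B' = \{1, x, x^2\}$ to be the basis for $P_2$. We showed in
Example 1 and Example 6 that

\[ [T_1]_{B',B} = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad [T_2]_{B'} = \begin{bmatrix} 1 & -5 & 25 \\ 0 & 3 & -30 \\ 0 & 0 & 9 \end{bmatrix} \]

Thus it follows from 17 that

\[ [T_2 \circ T_1]_{B',B} = \begin{bmatrix} 1 & -5 & 25 \\ 0 & 3 & -30 \\ 0 & 0 & 9 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -5 & 25 \\ 3 & -30 \\ 0 & 9 \end{bmatrix} \tag{18} \]

As a check, we will calculate $[T_2 \circ T_1]_{B',B}$ directly from Formula 4. Since $B = \{1, x\}$, it follows from Formula 4 with $\mathbf{u}_1 = 1$
and $\mathbf{u}_2 = x$ that

\[ [T_2 \circ T_1]_{B',B} = \big[\, [(T_2 \circ T_1)(1)]_{B'} \;\big|\; [(T_2 \circ T_1)(x)]_{B'} \,\big] \tag{19} \]

Using 16 yields

\[ (T_2 \circ T_1)(1) = 3x - 5 \quad \text{and} \quad (T_2 \circ T_1)(x) = (3x - 5)^2 = 9x^2 - 30x + 25 \]

Since $B' = \{1, x, x^2\}$, it follows from this that

\[ [(T_2 \circ T_1)(1)]_{B'} = \begin{bmatrix} -5 \\ 3 \\ 0 \end{bmatrix} \quad \text{and} \quad [(T_2 \circ T_1)(x)]_{B'} = \begin{bmatrix} 25 \\ -30 \\ 9 \end{bmatrix} \]

Substituting in 19 yields

\[ [T_2 \circ T_1]_{B',B} = \begin{bmatrix} -5 & 25 \\ 3 & -30 \\ 0 & 9 \end{bmatrix} \]

which agrees with 18.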
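The matrix product in Formula 18 is easy to confirm by machine. The sketch below is illustrative only: matrices are stored as lists of rows, and `mat_mul` is a helper written for this check, not something defined in the text.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


# [T1]_{B',B} for T1(p(x)) = x p(x), with B = {1, x} and B' = {1, x, x^2}
T1 = [[0, 0],
      [1, 0],
      [0, 1]]

# [T2]_{B'} for T2(p(x)) = p(3x - 5), taken from Example 6
T2 = [[1, -5, 25],
      [0, 3, -30],
      [0, 0, 9]]

composite = mat_mul(T2, T1)   # Theorem 8.4.2: [T2 o T1]_{B',B} = [T2]_{B'} [T1]_{B',B}

# Columns obtained directly from (3x - 5) and (3x - 5)^2 = 25 - 30x + 9x^2
direct = [[-5, 25],
          [3, -30],
          [0, 9]]
print(composite == direct)
```

Note the order of the factors: the matrix of the transformation applied first, $T_1$, stands on the right.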



Exercise Set 8.4 






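1. Let $T: P_2 \to P_3$ be the linear transformation defined by $T(p(x)) = x\,p(x)$.

(a) Find the matrix for $T$ with respect to the standard bases

$B = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ and $B' = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$

where

$\mathbf{u}_1 = 1, \quad \mathbf{u}_2 = x, \quad \mathbf{u}_3 = x^2$

$\mathbf{v}_1 = 1, \quad \mathbf{v}_2 = x, \quad \mathbf{v}_3 = x^2, \quad \mathbf{v}_4 = x^3$

(b) Verify that the matrix $[T]_{B',B}$ obtained in part (a) satisfies Formula 4a for every vector $\mathbf{x} = c_0 + c_1 x + c_2 x^2$ in $P_2$.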

2. Let $T: P_2 \to P_1$ be the linear transformation defined by

\[ T(a_0 + a_1 x + a_2 x^2) = (a_0 + a_1) - (2a_1 + 3a_2)x \]

(a) Find the matrix for $T$ with respect to the standard bases $B = \{1, x, x^2\}$ and $B' = \{1, x\}$ for $P_2$ and $P_1$.

(b) Verify that the matrix $[T]_{B',B}$ obtained in part (a) satisfies Formula 4a for every vector $\mathbf{x} = c_0 + c_1 x + c_2 x^2$ in $P_2$.



3. Let $T: P_2 \to P_2$ be the linear operator defined by

\[ T(a_0 + a_1 x + a_2 x^2) = a_0 + a_1(x - 1) + a_2(x - 1)^2 \]

(a) Find the matrix for $T$ with respect to the standard basis $B = \{1, x, x^2\}$ for $P_2$.

(b) Verify that the matrix $[T]_B$ obtained in part (a) satisfies Formula 5a for every vector $\mathbf{x} = a_0 + a_1 x + a_2 x^2$ in $P_2$.



4. 



Let $T: R^2 \to R^2$ be the linear operator defined by



*1 
*2 



xi I * 2 



and let $B = \{\mathbf{u}_1, \mathbf{u}_2\}$ be the basis for which



111 = 



and ii2 = 



-1 




(a) Find $[T]_B$.

(b) Verify that Formula 5a holds for every vector $\mathbf{x}$ in $R^2$.



Let $T: R^2 \to R^3$ be defined by



T 



x\ 
*2 



-xi 




(a) Find the matrix $[T]_{B',B}$ with respect to the bases $B = \{\mathbf{u}_1, \mathbf{u}_2\}$ and $B' = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$, where



"1 



"2 



vi 



V2 : 



V 3 : 



(b) Verify that Formula 4a holds for every vector $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ in $R^2$.



Let $T: R^3 \to R^3$ be the linear operator defined by



(a) Find the matrix for $T$ with respect to the basis $B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$, where

$\mathbf{v}_1 = (1, 0, 1), \quad \mathbf{v}_2 = (0, 1, 1), \quad \mathbf{v}_3 = (1, 1, 0)$

(b) Verify that Formula 5a holds for every vector $\mathbf{x} = (x_1, x_2, x_3)$ in $R^3$.

(c) Is $T$ one-to-one? If so, find the matrix of $T^{-1}$.



7. Let $T: P_2 \to P_2$ be the linear operator defined by $T(p(x)) = p(2x + 1)$, that is,

\[ T(c_0 + c_1 x + c_2 x^2) = c_0 + c_1(2x + 1) + c_2(2x + 1)^2 \]

(a) Find $[T]_B$ with respect to the basis $B = \{1, x, x^2\}$.

(b) Use the indirect procedure illustrated in Figure 8.4.5 to compute $T(2 - 3x + 4x^2)$.

(c) Check the result obtained in part (b) by computing $T(2 - 3x + 4x^2)$ directly.



8. Let $T: P_2 \to P_3$ be the linear transformation defined by $T(p(x)) = x\,p(x - 3)$, that is,

\[ T(c_0 + c_1 x + c_2 x^2) = x\,(c_0 + c_1(x - 3) + c_2(x - 3)^2) \]

(a) Find $[T]_{B',B}$ with respect to the bases $B = \{1, x, x^2\}$ and $B' = \{1, x, x^2, x^3\}$.

(b) Use the indirect procedure illustrated in Figure 8.4.5 to compute $T(1 + x - x^2)$.

(c) Check the result obtained in part (b) by computing $T(1 + x - x^2)$ directly.



<^ Let vi = 



and V2 = 



~-l" 


, and let 


4_ 






1 3~ 


A = 






-2 5_ 



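be the matrix for $T: R^2 \to R^2$ with respect to the basis $B = \{\mathbf{v}_1, \mathbf{v}_2\}$.

(a) Find $[T(\mathbf{v}_1)]_B$ and $[T(\mathbf{v}_2)]_B$.

(b) Find $T(\mathbf{v}_1)$ and $T(\mathbf{v}_2)$.

(c) Find a formula for $T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right)$.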



(d) 



Use the formula obtained in (c) to compute T\ 



10. Let ,4 = 



3-2 10' 
1 6 2 1 
-3 7 1 
be the matrix of $T: R^4 \to R^3$ with respect to the bases $B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ and $B' = \{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$, where



vi = 



wi = 



V2 = 



""2 = 



2 

1 

-1 

-1 

"-7 
8 
1 



v 3 = 



w 3 = 



1 

4 

-1 

2 

-6 
9 
1 



v 4 = 



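(a) Find $[T(\mathbf{v}_1)]_{B'}$, $[T(\mathbf{v}_2)]_{B'}$, $[T(\mathbf{v}_3)]_{B'}$, and $[T(\mathbf{v}_4)]_{B'}$.

(b) Find $T(\mathbf{v}_1)$, $T(\mathbf{v}_2)$, $T(\mathbf{v}_3)$, and $T(\mathbf{v}_4)$.

(c) Find a formula for $T\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}\right)$.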



(d) 



Use the formula obtained in (c) to compute T 



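11. Let

\[ A = \begin{bmatrix} 1 & 3 & -1 \\ 2 & 0 & 5 \\ 6 & -2 & 4 \end{bmatrix} \]

be the matrix of $T: P_2 \to P_2$ with respect to the basis $B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$, where $\mathbf{v}_1 = 2x + 3x^2$,
$\mathbf{v}_2 = -1 + 3x + 2x^2$, $\mathbf{v}_3 = 3 + 7x + 2x^2$.

(a) Find $[T(\mathbf{v}_1)]_B$, $[T(\mathbf{v}_2)]_B$, and $[T(\mathbf{v}_3)]_B$.

(b) Find $T(\mathbf{v}_1)$, $T(\mathbf{v}_2)$, and $T(\mathbf{v}_3)$.

(c) Find a formula for $T(a_0 + a_1 x + a_2 x^2)$.

(d) Use the formula obtained in (c) to compute $T(1 + x^2)$.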



12. Let $T_1: P_1 \to P_2$ be the linear transformation defined by

and let $T_2: P_2 \to P_2$ be the linear operator defined by

\[ T_2(p(x)) = p(2x + 1) \]

Let $B = \{1, x\}$ and $B' = \{1, x, x^2\}$ be the standard bases for $P_1$ and $P_2$.

(a) Find $[T_2 \circ T_1]_{B',B}$, $[T_2]_{B'}$, and $[T_1]_{B',B}$.

(b) State a formula relating the matrices in part (a).

(c) Verify that the matrices in part (a) satisfy the formula you stated in part (b).

13. Let $T_1: P_1 \to P_2$ be the linear transformation defined by

\[ T_1(c_0 + c_1 x) = 2c_0 - 3c_1 x \]

and let $T_2: P_2 \to P_3$ be the linear transformation defined by

\[ T_2(c_0 + c_1 x + c_2 x^2) = 3c_0 x + 3c_1 x^2 + 3c_2 x^3 \]

Let $B = \{1, x\}$, $B'' = \{1, x, x^2\}$, and $B' = \{1, x, x^2, x^3\}$.

(a) Find $[T_2 \circ T_1]_{B',B}$, $[T_2]_{B',B''}$, and $[T_1]_{B'',B}$.

(b) State a formula relating the matrices in part (a).

(c) Verify that the matrices in part (a) satisfy the formula you stated in part (b).



14. Show that if $T: V \to W$ is the zero transformation, then the matrix of $T$ with respect to any bases for $V$ and $W$ is a zero
matrix.

15. Show that if $T: V \to V$ is a contraction or a dilation of $V$ (Example 4 of Section 8.1), then the matrix of $T$ with respect to
any basis for $V$ is a positive scalar multiple of the identity matrix.

16. Let $B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ be a basis for a vector space $V$. Find the matrix with respect to $B$ of the linear operator $T: V \to V$
defined by $T(\mathbf{v}_1) = \mathbf{v}_2$, $T(\mathbf{v}_2) = \mathbf{v}_3$, $T(\mathbf{v}_3) = \mathbf{v}_4$, $T(\mathbf{v}_4) = \mathbf{v}_1$.

17. Prove that if $B$ and $B'$ are the standard bases for $R^n$ and $R^m$, respectively, then the matrix for a linear transformation
$T: R^n \to R^m$ with respect to the bases $B$ and $B'$ is the standard matrix for $T$.



18. (For Readers Who Have Studied Calculus)

Let $D: P_2 \to P_2$ be the differentiation operator $D(\mathbf{p}) = p'(x)$. In parts (a) and (b), find the matrix of $D$ with respect to the
basis $B = \{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$.

(a) $\mathbf{p}_1 = 1$, $\mathbf{p}_2 = x$, $\mathbf{p}_3 = x^2$

(b) $\mathbf{p}_1 = 2$, $\mathbf{p}_2 = 2 - 3x$, $\mathbf{p}_3 = 2 - 3x + 8x^2$

(c) Use the matrix in part (a) to compute $D(6 - 6x + 24x^2)$.

(d) Repeat the directions for part (c) for the matrix in part (b).

19. (For Readers Who Have Studied Calculus)

In each part, $B = \{\mathbf{f}_1, \mathbf{f}_2, \mathbf{f}_3\}$ is a basis for a subspace $V$ of the vector space of real-valued functions defined on the real
line. Find the matrix with respect to $B$ of the differentiation operator $D: V \to V$.

(a) $\mathbf{f}_1 = 1$, $\mathbf{f}_2 = \sin x$, $\mathbf{f}_3 = \cos x$

(b) $\mathbf{f}_1 = 1$, $\mathbf{f}_2 = e^x$, $\mathbf{f}_3 = e^{2x}$

(c) $\mathbf{f}_1 = e^{2x}$, $\mathbf{f}_2 = xe^{2x}$, $\mathbf{f}_3 = x^2e^{2x}$

(d) Use the matrix in part (c) to compute $D(4e^{2x} + 8xe^{2x} - 10x^2e^{2x})$.



Discussion 
Discovery



20. Let $V$ be a four-dimensional vector space with basis $B$, let $W$ be a seven-dimensional vector space
with basis $B'$, and let $T: V \to W$ be a linear transformation. Identify the four vector spaces that
contain the vectors at the corners of the accompanying diagram.

Figure Ex-20 (Corners: $\mathbf{x}$, $T(\mathbf{x})$, $[\mathbf{x}]_B$, $[T(\mathbf{x})]_{B'}$. Direct computation takes $\mathbf{x}$ to $T(\mathbf{x})$; multiplication by $[T]_{B',B}$ takes $[\mathbf{x}]_B$ to $[T(\mathbf{x})]_{B'}$.)



21. In each part, fill in the missing part of the equation.

(a) $[T_2 \circ T_1]_{B',B} = [T_2]_{\,?}\,[T_1]_{B'',B}$

(b) $[T_3 \circ T_2 \circ T_1]_{B',B} = [T_3]_{\,?}\,[T_2]_{B''',B''}\,[T_1]_{B'',B}$



22. Give two reasons why matrices for general linear transformations are important.
8.5

SIMILARITY 



The matrix of a linear operator T:V->V depends on the basis selected for V. 
One of the fundamental problems of linear algebra is to choose a basis for V 
that makes the matrix for T as simple as possible — a diagonal or a triangular 
matrix, for example. In this section we shall study this problem. 



Simple Matrices for Linear Operators 

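Standard bases do not necessarily produce the simplest matrices for linear operators. For example, consider the linear operator
$T: R^2 \to R^2$ defined by

\[ T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} x_1 + x_2 \\ -2x_1 + 4x_2 \end{bmatrix} \tag{1} \]

and the standard basis $B = \{\mathbf{e}_1, \mathbf{e}_2\}$ for $R^2$, where

\[ \mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]

By Theorem 8.4.1, the matrix for $T$ with respect to this basis is the standard matrix for $T$; that is,

\[ [T]_B = [T] = [\,T(\mathbf{e}_1) \mid T(\mathbf{e}_2)\,] \]

From 1,

\[ T(\mathbf{e}_1) = \begin{bmatrix} 1 \\ -2 \end{bmatrix}, \quad T(\mathbf{e}_2) = \begin{bmatrix} 1 \\ 4 \end{bmatrix} \]

so

\[ [T]_B = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix} \tag{2} \]

In comparison, we showed in Example 4 of Section 8.4 that if

\[ \mathbf{u}'_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad \mathbf{u}'_2 = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \tag{3} \]

then the matrix for $T$ with respect to the basis $B' = \{\mathbf{u}'_1, \mathbf{u}'_2\}$ is the diagonal matrix

\[ [T]_{B'} = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} \tag{4} \]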



This matrix is "simpler" than 2 in the sense that diagonal matrices enjoy special properties that more general matrices do not. 

One of the major themes in more advanced linear algebra courses is to determine the "simplest possible form" that can be 
obtained for the matrix of a linear operator by choosing the basis appropriately. Sometimes it is possible to obtain a diagonal 
matrix (as above, for example); other times one must settle for a triangular matrix or some other form. We will be able only to 
touch on this important topic in this text. 

The problem of finding a basis that produces the simplest possible matrix for a linear operator $T: V \to V$ can be attacked by
first finding a matrix for $T$ relative to any basis, say a standard basis, where applicable, and then changing the basis in a manner
that simplifies the matrix. Before pursuing this idea, it will be helpful to review some concepts about changing bases.

Recall from Formula 6 in Section 6.5 that if the sets $B = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ and $B' = \{\mathbf{u}'_1, \mathbf{u}'_2, \ldots, \mathbf{u}'_n\}$ are bases for a vector space
$V$, then the transition matrix from $B'$ to $B$ is given by the formula

\[ P = \big[\, [\mathbf{u}'_1]_B \mid [\mathbf{u}'_2]_B \mid \cdots \mid [\mathbf{u}'_n]_B \,\big] \tag{5} \]

This matrix has the property that for every vector $\mathbf{v}$ in $V$,

\[ P[\mathbf{v}]_{B'} = [\mathbf{v}]_B \tag{6} \]

That is, multiplication by $P$ maps the coordinate matrix for $\mathbf{v}$ relative to $B'$ into the coordinate matrix for $\mathbf{v}$ relative to $B$ [see
Formula 5 in Section 6.5]. We showed in Theorem 6.5.4 that $P$ is invertible and $P^{-1}$ is the transition matrix from $B$ to $B'$.

The following theorem gives a useful alternative viewpoint about transition matrices; it shows that the transition matrix from a
basis $B'$ to a basis $B$ can be regarded as the matrix of an identity operator.

THEOREM 8.5.1 





If $B$ and $B'$ are bases for a finite-dimensional vector space $V$, and if $I: V \to V$ is the identity operator, then the transition
matrix from $B'$ to $B$ is $[I]_{B,B'}$.

Proof Suppose that $B = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ and $B' = \{\mathbf{u}'_1, \mathbf{u}'_2, \ldots, \mathbf{u}'_n\}$ are bases for $V$. Using the fact that $I(\mathbf{v}) = \mathbf{v}$ for all $\mathbf{v}$ in $V$, it
follows from Formula 4 of Section 8.4 with $B$ and $B'$ reversed that

\[ [I]_{B,B'} = \big[\, [I(\mathbf{u}'_1)]_B \mid [I(\mathbf{u}'_2)]_B \mid \cdots \mid [I(\mathbf{u}'_n)]_B \,\big] = \big[\, [\mathbf{u}'_1]_B \mid [\mathbf{u}'_2]_B \mid \cdots \mid [\mathbf{u}'_n]_B \,\big] \]

Thus, from 5, we have $[I]_{B,B'} = P$, which shows that $[I]_{B,B'}$ is the transition matrix from $B'$ to $B$.

■

The result in this theorem is illustrated in Figure 8.5.1.

Figure 8.5.1 $[I]_{B,B'}$ is the transition matrix from $B'$ to $B$.

Effect of Changing Bases on Matrices of Linear Operators 

We are now ready to consider the main problem in this section. 

Problem If $B$ and $B'$ are two bases for a finite-dimensional vector space $V$, and if $T: V \to V$ is a linear operator, what
relationship, if any, exists between the matrices $[T]_B$ and $[T]_{B'}$?

The answer to this question can be obtained by considering the composition of the three linear operators on $V$ pictured in Figure
8.5.2.

Figure 8.5.2 ($V$ with basis $B'$ $\xrightarrow{\;I\;}$ $V$ with basis $B$ $\xrightarrow{\;T\;}$ $V$ with basis $B$ $\xrightarrow{\;I\;}$ $V$ with basis $B'$.)



In this figure, $\mathbf{v}$ is first mapped into itself by the identity operator, then $\mathbf{v}$ is mapped into $T(\mathbf{v})$ by $T$, then $T(\mathbf{v})$ is mapped into
itself by the identity operator. All four vector spaces involved in the composition are the same (namely, $V$); however, the bases
for the spaces vary. Since the starting vector is $\mathbf{v}$ and the final vector is $T(\mathbf{v})$, the composition is the same as $T$; that is,

\[ T = I \circ T \circ I \tag{7} \]

If, as illustrated in Figure 8.5.2, the first and last vector spaces are assigned the basis $B'$ and the middle two spaces are assigned
the basis $B$, then it follows from 7 and Formula 15 of Section 8.4 (with an appropriate adjustment in the names of the bases) that

\[ [T]_{B',B'} = [I \circ T \circ I]_{B',B'} = [I]_{B',B}\,[T]_{B,B}\,[I]_{B,B'} \tag{8} \]

or, in simpler notation,

\[ [T]_{B'} = [I]_{B',B}\,[T]_B\,[I]_{B,B'} \tag{9} \]

But it follows from Theorem 8.5.1 that $[I]_{B,B'}$ is the transition matrix from $B'$ to $B$ and consequently, $[I]_{B',B}$ is the transition
matrix from $B$ to $B'$. Thus, if we let $P = [I]_{B,B'}$, then $P^{-1} = [I]_{B',B}$, so 9 can be written as

\[ [T]_{B'} = P^{-1}[T]_B P \]

In summary, we have the following theorem.



THEOREM 8.5.2 



Let $T: V \to V$ be a linear operator on a finite-dimensional vector space $V$, and let $B$ and $B'$ be bases for $V$. Then

\[ [T]_{B'} = P^{-1}[T]_B P \tag{10} \]

where $P$ is the transition matrix from $B'$ to $B$.



Warning When applying Theorem 8.5.2, it is easy to forget whether $P$ is the transition matrix from $B$ to $B'$ (incorrect) or from
$B'$ to $B$ (correct). As indicated in Figure 8.5.3, it may help to write 10 in form 9, keeping in mind that the three "interior"
subscripts are the same and the two exterior subscripts are the same. Once you master the pattern shown in this figure, you need
only remember that $P = [I]_{B,B'}$ is the transition matrix from $B'$ to $B$ and that $P^{-1} = [I]_{B',B}$ is its inverse.

Figure 8.5.3



EXAMPLE 1 Using Theorem 8.5.2 



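Let $T: R^2 \to R^2$ be defined by

\[ T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} x_1 + x_2 \\ -2x_1 + 4x_2 \end{bmatrix} \]

Find the matrix of $T$ with respect to the standard basis $B = \{\mathbf{e}_1, \mathbf{e}_2\}$ for $R^2$, then use Theorem 8.5.2 to find the matrix of $T$ with
respect to the basis $B' = \{\mathbf{u}'_1, \mathbf{u}'_2\}$, where

\[ \mathbf{u}'_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad \text{and} \quad \mathbf{u}'_2 = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \]

Solution

We showed earlier in this section [see 2] that

\[ [T]_B = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix} \]

To find $[T]_{B'}$ from 10, we will need to find the transition matrix

\[ P = [I]_{B,B'} = \big[\, [\mathbf{u}'_1]_B \mid [\mathbf{u}'_2]_B \,\big] \]

[see 5]. By inspection,

\[ \mathbf{u}'_1 = \mathbf{e}_1 + \mathbf{e}_2, \qquad \mathbf{u}'_2 = \mathbf{e}_1 + 2\mathbf{e}_2 \]

so

\[ [\mathbf{u}'_1]_B = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad \text{and} \quad [\mathbf{u}'_2]_B = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \]

Thus the transition matrix from $B'$ to $B$ is

\[ P = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \]

The reader can check that

\[ P^{-1} = \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix} \]

so by Theorem 8.5.2, the matrix of $T$ relative to the basis $B'$ is

\[ [T]_{B'} = P^{-1}[T]_B P = \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} \]

which agrees with 4.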
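The change of basis in this example can be reproduced in a few lines. The following is a sketch rather than a general-purpose routine: it uses exact rational arithmetic from Python's standard `fractions` module, and the helper names `mat_mul` and `inverse_2x2` are chosen here for illustration.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


T_B = [[1, 1], [-2, 4]]   # matrix of T relative to the standard basis B
P = [[1, 1], [1, 2]]      # transition matrix from B' to B (columns are u'_1, u'_2)

P_inv = inverse_2x2(P)
T_Bprime = mat_mul(mat_mul(P_inv, T_B), P)   # Theorem 8.5.2: P^{-1} [T]_B P
print(P_inv, T_Bprime)
```

Placing the new basis vectors in the columns of `P` is what makes `P` the transition matrix from $B'$ to $B$; using its inverse in that position instead would compute a different matrix.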



Similarity 

The relationship in Formula 10 is of such importance that there is some terminology associated with it. 



DEFINITION 



If $A$ and $B$ are square matrices, we say that $B$ is similar to $A$ if there is an invertible matrix $P$ such that $B = P^{-1}AP$.



Remark It is left as an exercise to show that if a matrix B is similar to a matrix A, then necessarily A is similar to B. Therefore, 
we shall usually simply say that A and B are similar. 

Similarity Invariants 

Similar matrices often have properties in common; for example, if $A$ and $B$ are similar matrices, then $A$ and $B$ have the same
determinant. To see that this is so, suppose that

\[ B = P^{-1}AP \]

Then

\[ \det(B) = \det(P^{-1}AP) = \det(P^{-1})\det(A)\det(P) = \frac{1}{\det(P)}\det(A)\det(P) = \det(A) \]



We make the following definition. 



DEFINITION 



A property of square matrices is said to be a similarity invariant or invariant under similarity if that property is shared by 
any two similar matrices. 



In the terminology of this definition, the determinant of a square matrix is a similarity invariant. Table 1 lists some other 
important similarity invariants. The proofs of some of the results in Table 1 are given in the exercises. 



Table 1 Similarity Invariants

Property                     Description
Determinant                  $A$ and $P^{-1}AP$ have the same determinant.
Invertibility                $A$ is invertible if and only if $P^{-1}AP$ is invertible.
Rank                         $A$ and $P^{-1}AP$ have the same rank.
Nullity                      $A$ and $P^{-1}AP$ have the same nullity.
Trace                        $A$ and $P^{-1}AP$ have the same trace.
Characteristic polynomial    $A$ and $P^{-1}AP$ have the same characteristic polynomial.
Eigenvalues                  $A$ and $P^{-1}AP$ have the same eigenvalues.
Eigenspace dimension         If $\lambda$ is an eigenvalue of $A$ and $P^{-1}AP$, then the eigenspace of $A$ corresponding to $\lambda$ and the
                             eigenspace of $P^{-1}AP$ corresponding to $\lambda$ have the same dimension.
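A few of the invariants in Table 1 can be spot-checked on the similar pair from Example 1, namely $A = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix}$ and $P^{-1}AP = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}$. The sketch below is an illustration on one pair, not a proof; the helpers `det2`, `trace`, and `char_poly_2x2` are names introduced only for this check.

```python
def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))


def char_poly_2x2(M):
    """Coefficients (1, -trace, det) of lambda^2 - tr(M)*lambda + det(M)."""
    return (1, -trace(M), det2(M))


A = [[1, 1], [-2, 4]]   # [T]_B from Example 1
D = [[2, 0], [0, 3]]    # P^{-1} A P from Example 1

print(det2(A), det2(D))                    # determinants
print(trace(A), trace(D))                  # traces
print(char_poly_2x2(A), char_poly_2x2(D))  # characteristic polynomials
```

Because the characteristic polynomials coincide, so do the eigenvalues, which for the diagonal matrix can be read off the diagonal.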



It follows from Theorem 8.5.2 that two matrices representing the same linear operator $T: V \to V$ with respect to different
bases are similar. Thus, if $B$ is a basis for $V$, and the matrix $[T]_B$ has some property that is invariant under similarity, then for
every basis $B'$, the matrix $[T]_{B'}$ has that same property. For example, for any two bases $B$ and $B'$ we must have

\[ \det([T]_B) = \det([T]_{B'}) \]

It follows from this equation that the value of the determinant depends on $T$, but not on the particular basis that is used to obtain
the matrix for $T$. Thus the determinant can be regarded as a property of the linear operator $T$; indeed, if $V$ is a finite-dimensional
vector space, then we can define the determinant of the linear operator $T$ to be

\[ \det(T) = \det([T]_B) \tag{11} \]

where $B$ is any basis for $V$.



EXAMPLE 2 Determinant of a Linear Operator 



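Let $T: R^2 \to R^2$ be defined by

\[ T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} x_1 + x_2 \\ -2x_1 + 4x_2 \end{bmatrix} \]

Find $\det(T)$.

Solution

We can choose any basis $B$ and calculate $\det([T]_B)$. If we take the standard basis, then from Example 1,

\[ [T]_B = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix} \quad \text{so} \quad \det(T) = \begin{vmatrix} 1 & 1 \\ -2 & 4 \end{vmatrix} = 6 \]

Had we chosen the basis $B' = \{\mathbf{u}'_1, \mathbf{u}'_2\}$ of Example 1, then we would have obtained

\[ [T]_{B'} = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} \quad \text{so} \quad \det(T) = \begin{vmatrix} 2 & 0 \\ 0 & 3 \end{vmatrix} = 6 \]

which agrees with the preceding computation.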



EXAMPLE 3 Reflection About a Line 



Let $l$ be the line in the $xy$-plane that passes through the origin and makes an angle $\theta$ with the positive $x$-axis, where $0 \le \theta < \pi$. As
illustrated in Figure 8.5.4, let $T: R^2 \to R^2$ be the linear operator that maps each vector into its reflection about the line $l$.

Figure 8.5.4

(a) Find the standard matrix for $T$.

(b) Find the reflection of the vector $\mathbf{x} = (1, 2)$ about the line $l$ through the origin that makes an angle of $\theta = \pi/6$ with the
positive $x$-axis.



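Solution (a)

We could proceed as in Example 6 of Section 4.3 and try to construct the standard matrix from the formula

\[ [T]_B = [T] = [\,T(\mathbf{e}_1) \mid T(\mathbf{e}_2)\,] \]

where $B = \{\mathbf{e}_1, \mathbf{e}_2\}$ is the standard basis for $R^2$. However, it is easier to use a different strategy: Instead of finding $[T]_B$
directly, we shall first find the matrix $[T]_{B'}$, where

\[ B' = \{\mathbf{u}'_1, \mathbf{u}'_2\} \]

is the basis consisting of a unit vector $\mathbf{u}'_1$ along $l$ and a unit vector $\mathbf{u}'_2$ perpendicular to $l$ (Figure 8.5.5).

Figure 8.5.5

Once we have found $[T]_{B'}$, we shall perform a change of basis to find $[T]_B$. The computations are as follows:

\[ T(\mathbf{u}'_1) = \mathbf{u}'_1 \quad \text{and} \quad T(\mathbf{u}'_2) = -\mathbf{u}'_2 \]

so

\[ [T(\mathbf{u}'_1)]_{B'} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad \text{and} \quad [T(\mathbf{u}'_2)]_{B'} = \begin{bmatrix} 0 \\ -1 \end{bmatrix} \]

Thus

\[ [T]_{B'} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \]

From the computations in Example 6 of Section 6.5, the transition matrix from $B'$ to $B$ is

\[ P = \big[\, [\mathbf{u}'_1]_B \mid [\mathbf{u}'_2]_B \,\big] = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \tag{12} \]

It follows from Formula 10 that

\[ [T]_B = P[T]_{B'}P^{-1} \]

Thus, from 12, the standard matrix for $T$ is

\[ [T] = P[T]_{B'}P^{-1} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \]

\[ = \begin{bmatrix} \cos^2\theta - \sin^2\theta & 2\sin\theta\cos\theta \\ 2\sin\theta\cos\theta & \sin^2\theta - \cos^2\theta \end{bmatrix} = \begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix} \]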



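Solution (b)

It follows from part (a) that the formula for $T$ in matrix notation is

\[ T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]

Substituting $\theta = \pi/6$ in this formula yields

\[ T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} \frac{1}{2} & \frac{\sqrt{3}}{2} \\ \frac{\sqrt{3}}{2} & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]

so

\[ T\left(\begin{bmatrix} 1 \\ 2 \end{bmatrix}\right) = \begin{bmatrix} \frac{1}{2} & \frac{\sqrt{3}}{2} \\ \frac{\sqrt{3}}{2} & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} + \sqrt{3} \\ \frac{\sqrt{3}}{2} - 1 \end{bmatrix} \]

Thus $T(1, 2) = \left(\frac{1}{2} + \sqrt{3},\; \frac{\sqrt{3}}{2} - 1\right)$.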
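The reflection formula from part (a) can be tested numerically. The sketch below assumes angles are given in radians and uses floating-point arithmetic, so comparisons are made up to a small tolerance; `reflection_matrix` and `apply` are illustrative helper names.

```python
import math


def reflection_matrix(theta):
    """Standard matrix of the reflection about the line through the origin at angle theta."""
    c, s = math.cos(2 * theta), math.sin(2 * theta)
    return [[c, s],
            [s, -c]]


def apply(M, v):
    """Multiply the matrix M (list of rows) by the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


R = reflection_matrix(math.pi / 6)
image = apply(R, [1, 2])
expected = [0.5 + math.sqrt(3), math.sqrt(3) / 2 - 1]   # value found in part (b)

# A reflection applied twice should return the original vector.
twice = apply(R, image)
print(image, expected, twice)
```

The second check reflects the fact that $[T]_{B'} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$ squares to the identity, and hence so does every matrix similar to it.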



Eigenvalues of a Linear Operator 

Eigenvectors and eigenvalues can be defined for linear operators as well as matrices. A scalar $\lambda$ is called an eigenvalue of a
linear operator $T: V \to V$ if there is a nonzero vector $\mathbf{x}$ in $V$ such that $T\mathbf{x} = \lambda\mathbf{x}$. The vector $\mathbf{x}$ is called an eigenvector of $T$
corresponding to $\lambda$. Equivalently, the eigenvectors of $T$ corresponding to $\lambda$ are the nonzero vectors in the kernel of $\lambda I - T$
(Exercise 15). This kernel is called the eigenspace of $T$ corresponding to $\lambda$.



EXAMPLE 4 Eigenvalues of a Linear Operator 

Let $V = F(-\infty, \infty)$ and consider the linear operator $T$ on $V$ that maps $f(x)$ to $f(x - 2\pi)$. If $f(x) = \sin(x)$, then
$T(f) = \sin(x - 2\pi) = \sin(x)$, so $\sin(x)$ is an eigenvector of $T$ associated with the eigenvalue 1:

\[ T(\sin(x)) = 1 \cdot \sin(x) \]

Other eigenvectors of $T$ associated with the eigenvalue 1 include $\sin(2x)$, $\cos(5x)$, and the constant function 3.

It can be shown that if $V$ is a finite-dimensional vector space, and $B$ is any basis for $V$, then

1. The eigenvalues of $T$ are the same as the eigenvalues of $[T]_B$.

2. A vector $\mathbf{x}$ is an eigenvector of $T$ corresponding to $\lambda$ if and only if its coordinate matrix $[\mathbf{x}]_B$ is an eigenvector of $[T]_B$
corresponding to $\lambda$.

We omit the proofs.



EXAMPLE 5 Eigenvalues and Bases for Eigenspaces 



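Find the eigenvalues and bases for the eigenspaces of the linear operator $T: P_2 \to P_2$ defined by

\[ T(a + bx + cx^2) = -2c + (a + 2b + c)x + (a + 3c)x^2 \]

Solution

The matrix for $T$ with respect to the standard basis $B = \{1, x, x^2\}$ is

\[ [T]_B = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix} \]

(verify). The eigenvalues of $T$ are $\lambda = 1$ and $\lambda = 2$ (Example 5 of Section 7.1). Also from that example, the eigenspace of $[T]_B$
corresponding to $\lambda = 2$ has the basis $\{\mathbf{u}_1, \mathbf{u}_2\}$, where

\[ \mathbf{u}_1 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \]

and the eigenspace of $[T]_B$ corresponding to $\lambda = 1$ has the basis $\{\mathbf{u}_3\}$, where

\[ \mathbf{u}_3 = \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} \]

The matrices $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$ are the coordinate matrices relative to $B$ of

\[ \mathbf{p}_1 = -1 + x^2, \quad \mathbf{p}_2 = x, \quad \mathbf{p}_3 = -2 + x + x^2 \]

Thus the eigenspace of $T$ corresponding to $\lambda = 2$ has the basis

\[ \{\mathbf{p}_1, \mathbf{p}_2\} = \{-1 + x^2,\; x\} \]

and that corresponding to $\lambda = 1$ has the basis

\[ \{\mathbf{p}_3\} = \{-2 + x + x^2\} \]

As a check, the reader should use the given formula for $T$ to verify that $T(\mathbf{p}_1) = 2\mathbf{p}_1$, $T(\mathbf{p}_2) = 2\mathbf{p}_2$, and $T(\mathbf{p}_3) = \mathbf{p}_3$.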
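The suggested check can be carried out mechanically. In the sketch below a polynomial $a + bx + cx^2$ is represented, by assumption, as the list `[a, b, c]`, and the function `T` simply transcribes the formula given in this example.

```python
def T(p):
    """Apply T(a + bx + cx^2) = -2c + (a + 2b + c)x + (a + 3c)x^2 to p = [a, b, c]."""
    a, b, c = p
    return [-2 * c, a + 2 * b + c, a + 3 * c]


def scale(k, p):
    """Scalar multiple k*p of a coefficient list."""
    return [k * t for t in p]


p1 = [-1, 0, 1]   # -1 + x^2
p2 = [0, 1, 0]    # x
p3 = [-2, 1, 1]   # -2 + x + x^2

checks = [
    T(p1) == scale(2, p1),   # eigenvalue 2
    T(p2) == scale(2, p2),   # eigenvalue 2
    T(p3) == scale(1, p3),   # eigenvalue 1
]
print(checks)
```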



EXAMPLE 6 Diagonal Matrix for a Linear Operator 



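Let $T: R^3 \to R^3$ be the linear operator given by

\[ T\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) = \begin{bmatrix} -2x_3 \\ x_1 + 2x_2 + x_3 \\ x_1 + 3x_3 \end{bmatrix} \]

Find a basis for $R^3$ relative to which the matrix for $T$ is diagonal.

Solution

First we will find the standard matrix for $T$; then we will look for a change of basis that diagonalizes the standard matrix.
If $B = \{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ denotes the standard basis for $R^3$, then

\[ T(\mathbf{e}_1) = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad T(\mathbf{e}_2) = \begin{bmatrix} 0 \\ 2 \\ 0 \end{bmatrix}, \quad T(\mathbf{e}_3) = \begin{bmatrix} -2 \\ 1 \\ 3 \end{bmatrix} \]

so the standard matrix for $T$ is

\[ [T] = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix} \tag{13} \]

We now want to change from the standard basis $B$ to a new basis $B' = \{\mathbf{u}'_1, \mathbf{u}'_2, \mathbf{u}'_3\}$ in order to obtain a diagonal matrix for $T$. If
we let $P$ be the transition matrix from the unknown basis $B'$ to the standard basis $B$, then by Theorem 8.5.2, the matrices $[T]$
and $[T]_{B'}$ will be related by

\[ [T]_{B'} = P^{-1}[T]P \tag{14} \]

In Example 1 of Section 7.2, we found that the matrix in 13 is diagonalized by

\[ P = \begin{bmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix} \]

Since $P$ represents the transition matrix from the basis $B' = \{\mathbf{u}'_1, \mathbf{u}'_2, \mathbf{u}'_3\}$ to the standard basis $B = \{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$, the columns of
$P$ are $[\mathbf{u}'_1]_B$, $[\mathbf{u}'_2]_B$, and $[\mathbf{u}'_3]_B$, so

\[ [\mathbf{u}'_1]_B = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}, \quad [\mathbf{u}'_2]_B = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad [\mathbf{u}'_3]_B = \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} \]

Thus

\[ \mathbf{u}'_1 = (-1)\mathbf{e}_1 + (0)\mathbf{e}_2 + (1)\mathbf{e}_3 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \]

\[ \mathbf{u}'_2 = (0)\mathbf{e}_1 + (1)\mathbf{e}_2 + (0)\mathbf{e}_3 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \]

\[ \mathbf{u}'_3 = (-2)\mathbf{e}_1 + (1)\mathbf{e}_2 + (1)\mathbf{e}_3 = \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} \]

are basis vectors that produce a diagonal matrix for $[T]_{B'}$.

As a check, let us compute $[T]_{B'}$ directly. From the given formula for $T$, we have

\[ T(\mathbf{u}'_1) = \begin{bmatrix} -2 \\ 0 \\ 2 \end{bmatrix} = 2\mathbf{u}'_1, \quad T(\mathbf{u}'_2) = \begin{bmatrix} 0 \\ 2 \\ 0 \end{bmatrix} = 2\mathbf{u}'_2, \quad T(\mathbf{u}'_3) = \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} = \mathbf{u}'_3 \]

so that

\[ [T(\mathbf{u}'_1)]_{B'} = \begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix}, \quad [T(\mathbf{u}'_2)]_{B'} = \begin{bmatrix} 0 \\ 2 \\ 0 \end{bmatrix}, \quad [T(\mathbf{u}'_3)]_{B'} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]

Thus

\[ [T]_{B'} = \big[\, [T(\mathbf{u}'_1)]_{B'} \mid [T(\mathbf{u}'_2)]_{B'} \mid [T(\mathbf{u}'_3)]_{B'} \,\big] = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

This is consistent with 14 since

\[ P^{-1}[T]P = \begin{bmatrix} 1 & 0 & 2 \\ 1 & 1 & 1 \\ -1 & 0 & -1 \end{bmatrix} \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix} \begin{bmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]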



We now see that the problem we studied in Section 7.2, that of diagonalizing a matrix A, may be viewed as the problem of 
finding a diagonal matrix D that is similar to A, or as the problem of finding a basis with respect to which the linear 
transformation defined by A is diagonal. 
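The diagonalization in Example 6 can be confirmed with integer arithmetic alone. The following sketch takes the standard matrix of $T$ and the matrix $P$ from the example; the inverse is entered by hand and then checked against $P$ rather than computed, and `mat_mul` is a helper written only for this illustration.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[0, 0, -2],
     [1, 2, 1],
     [1, 0, 3]]          # standard matrix of T

P = [[-1, 0, -2],
     [0, 1, 1],
     [1, 0, 1]]          # columns are the new basis vectors u'_1, u'_2, u'_3

P_inv = [[1, 0, 2],
         [1, 1, 1],
         [-1, 0, -1]]    # candidate inverse of P, verified below

identity = mat_mul(P_inv, P)
D = mat_mul(mat_mul(P_inv, A), P)   # matrix of T relative to B'
print(identity, D)
```

The diagonal entries of `D` are the eigenvalues of $T$, listed in the same order as the corresponding eigenvectors appear in the columns of `P`.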



Exercise Set 8.5 






In Exercises 1-7 find the matrix of $T$ with respect to the basis $B$, and use Theorem 8.5.2 to compute the matrix of $T$ with respect
to the basis $B'$.



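1. $T: R^2 \to R^2$ is defined by

\[ T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} x_1 - 2x_2 \\ -x_2 \end{bmatrix} \]

$B = \{\mathbf{u}_1, \mathbf{u}_2\}$ and $B' = \{\mathbf{v}_1, \mathbf{v}_2\}$, where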



"1 



"2 



vi 



V2 : 



-3 
4 



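2. $T: R^2 \to R^2$ is defined by

\[ T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} x_1 + 7x_2 \\ 3x_1 - 4x_2 \end{bmatrix} \]

$B = \{\mathbf{u}_1, \mathbf{u}_2\}$ and $B' = \{\mathbf{v}_1, \mathbf{v}_2\}$, where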



111 = 



"2 = 



▼1 = 



V2 = 



-1 
-1 



3. $T: R^2 \to R^2$ is the rotation about the origin through 45°; $B$ and $B'$ are the bases in Exercise 1.



4. $T: R^3 \to R^3$ is defined by



T 



*1 
*2 
*3 



*1 4- 2^2 — ^3 
*1 I 7*3 



$B$ is the standard basis for $R^3$ and $B' = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$, where



vi 



V2 : 



V 3 : 



5. $T: R^3 \to R^3$ is the orthogonal projection on the $xy$-plane; $B$ and $B'$ are as in Exercise 4.



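6. $T: R^2 \to R^2$ is defined by $T(\mathbf{x}) = 5\mathbf{x}$; $B$ and $B'$ are the bases in Exercise 2.

7. $T: P_1 \to P_1$ is defined by $T(a_0 + a_1 x) = a_0 + a_1(x + 1)$; $B = \{\mathbf{p}_1, \mathbf{p}_2\}$ and $B' = \{\mathbf{q}_1, \mathbf{q}_2\}$, where $\mathbf{p}_1 = 6 + 3x$,
$\mathbf{p}_2 = 10 + 2x$, $\mathbf{q}_1 = 2$, $\mathbf{q}_2 = 3 + 2x$.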

8. Find $\det(T)$.

(a) $T: R^2 \to R^2$, where $T(x_1, x_2) = (3x_1 - 4x_2,\; -x_1 + 7x_2)$

(b) $T: R^3 \to R^3$, where $T(x_1, x_2, x_3) = (x_1 - x_2,\; x_2 - x_3,\; x_3 - x_1)$

(c) $T: P_2 \to P_2$, where $T(p(x)) = p(x - 1)$

9. Prove that the following are similarity invariants:

(a) rank

(b) nullity

(c) invertibility

10. Let $T: P_4 \to P_4$ be the linear operator given by the formula $T(p(x)) = p(2x + 1)$.

(a) Find a matrix for $T$ with respect to some convenient basis; then use Theorem 8.2.2 to find the rank and nullity of $T$.

(b) Use the result in part (a) to determine whether $T$ is one-to-one.



11. In each part, find a basis for $R^2$ relative to which the matrix for $T$ is diagonal.



(a) 



*1 

*2 



2*i I 4*2 



(b) $T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} 4x_1 - x_2 \\ -3x_1 + x_2 \end{bmatrix}$



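12. In each part, find a basis for $R^3$ relative to which the matrix for $T$ is diagonal.

(a) $T\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) = \begin{bmatrix} -2x_1 + x_2 - x_3 \\ x_1 - 2x_2 - x_3 \\ -x_1 - x_2 - 2x_3 \end{bmatrix}$

(b) $T\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) = \begin{bmatrix} -x_2 + x_3 \\ -x_1 + x_3 \\ x_1 + x_2 \end{bmatrix}$

(c) $T\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) = \begin{bmatrix} 4x_1 + x_3 \\ 2x_1 + 3x_2 + 2x_3 \\ x_1 + 4x_3 \end{bmatrix}$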



13. Let $T: P_2 \to P_2$ be defined by

\[ T(a_0 + a_1 x + a_2 x^2) = (5a_0 + 6a_1 + 2a_2) - (a_1 + 8a_2)x + (a_0 - 2a_2)x^2 \]

(a) Find the eigenvalues of $T$.

(b) Find bases for the eigenspaces of $T$.



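14. Let $T: M_{22} \to M_{22}$ be defined by

\[ T\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = \begin{bmatrix} 2c & a + c \\ b - 2c & d \end{bmatrix} \]

(a) Find the eigenvalues of $T$.

(b) Find bases for the eigenspaces of $T$.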



15. Let $\lambda$ be an eigenvalue of a linear operator $T: V \to V$. Prove that the eigenvectors of $T$ corresponding to $\lambda$ are the nonzero
vectors in the kernel of $\lambda I - T$.



16. (a) Prove that if $A$ and $B$ are similar matrices, then $A^2$ and $B^2$ are also similar. More generally, prove that $A^k$ and $B^k$ are
similar, where $k$ is any positive integer.

(b) If $A^2$ and $B^2$ are similar, must $A$ and $B$ be similar?



17. Let $C$ and $D$ be $m \times n$ matrices, and let $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be a basis for a vector space $V$. Show that if $C[\mathbf{x}]_B = D[\mathbf{x}]_B$
for all $\mathbf{x}$ in $V$, then $C = D$.

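18. Let $l$ be a line in the $xy$-plane that passes through the origin and makes an angle $\theta$ with the positive $x$-axis. As illustrated in
the accompanying figure, let $T: R^2 \to R^2$ be the orthogonal projection of $R^2$ onto $l$. Use the method of Example 3 to show
that

\[ [T] = \begin{bmatrix} \cos^2\theta & \sin\theta\cos\theta \\ \sin\theta\cos\theta & \sin^2\theta \end{bmatrix} \]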



Note See Example 6 of Section 4.3.

Figure Ex-18



Discussion 
Discovery 



19. Indicate whether each statement is always true or sometimes false. Justify your answer by giving a
logical argument or a counterexample.

(a) A matrix cannot be similar to itself.

(b) If $A$ is similar to $B$, and $B$ is similar to $C$, then $A$ is similar to $C$.

(c) If $A$ and $B$ are similar and $B$ is singular, then $A$ is singular.

(d) If $A$ and $B$ are invertible and similar, then $A^{-1}$ and $B^{-1}$ are similar.



20. Find two nonzero $2 \times 2$ matrices that are not similar, and explain why they are not.



21. Complete the proof by filling in the blanks with an appropriate justification.

Hypothesis: $A$ and $B$ are similar matrices.

Conclusion: $A$ and $B$ have the same characteristic polynomial (and hence the same
eigenvalues).

Proof:

(1) $\det(\lambda I - B) = \det(\lambda I - P^{-1}AP)$ ________

(2) $= \det(\lambda P^{-1}P - P^{-1}AP)$ ________

(3) $= \det(P^{-1}(\lambda I - A)P)$ ________

(4) $= \det(P^{-1})\det(\lambda I - A)\det(P)$ ________

(5) $= \det(P^{-1})\det(P)\det(\lambda I - A)$ ________

(6) $= \det(\lambda I - A)$ ________

22. If $A$ and $B$ are similar matrices, say $B = P^{-1}AP$, then Exercise 21 shows that $A$ and $B$ have the
same eigenvalues. Suppose that $\lambda$ is one of the common eigenvalues and $\mathbf{x}$ is a corresponding
eigenvector for $A$. See if you can find an eigenvector of $B$ corresponding to $\lambda$, expressed in terms
of $\lambda$, $\mathbf{x}$, and $P$.

23. Since the standard basis for $R^n$ is so simple, why would one want to represent a linear operator on
$R^n$ in another basis?

24. Characterize the eigenspace of $\lambda = 1$ in Example 4.

25. Prove that the trace is a similarity invariant.
8.6 ISOMORPHISM

Our previous work shows that every real vector space of dimension n can be related to $R^n$ through coordinate vectors and that every linear transformation from a real vector space of dimension n to one of dimension m can be related to $R^n$ and $R^m$ through transition matrices. In this section we shall further strengthen the connection between a real vector space of dimension n and $R^n$.



Onto Transformations 

Let V and W be real vector spaces. We say that the linear transformation $T: V \to W$ is onto if the range of T is W; that is, if for every $\mathbf{w}$ in W, there is a $\mathbf{v}$ in V such that

$$T(\mathbf{v}) = \mathbf{w}$$

An onto transformation is also said to be surjective or to be a surjection. For a surjective mapping, then, the range and the 
codomain coincide. 



EXAMPLE 1 Onto Transformations 

Consider the projection $P: R^3 \to R^2$ defined by $P(x, y, z) = (x, y)$. This is an onto mapping, because if $\mathbf{w} = (x, y)$ is a point in $R^2$, then $\mathbf{v} = (x, y, 0)$ is mapped to it. (Of course, so are infinitely many other points in $R^3$.)

Consider the transformation $Q: R^3 \to R^3$ defined by $Q(x, y, z) = (x, y, 0)$. This is essentially the same as P except that we consider the result to be a vector in $R^3$ rather than a vector in $R^2$. This mapping is not onto, because, for example, the point (1, 1, 1) in the codomain is not the image of any $\mathbf{v}$ in the domain.

If a transformation $T: V \to W$ is both one-to-one (also called injective or an injection) and onto, then it is a one-to-one mapping to its range W and so has an inverse $T^{-1}: W \to V$. A transformation that is one-to-one and onto is also said to be bijective or to be a bijection between V and W. In the exercises, you'll be asked to show that the inverse of a bijection is also a bijection.

In Section 8.3 it was stated that if V and W are finite-dimensional vector spaces, then the dimension of the codomain W must be at least as large as the dimension of the domain V for there to exist a one-to-one linear transformation from V to W. That is, there can be an injective linear transformation from V to W only if $\dim(V) \le \dim(W)$. Similarly, there can be a surjective linear transformation from V to W only if $\dim(V) \ge \dim(W)$. Theorem 8.6.1 follows immediately.

THEOREM 8.6.1

Bijective Linear Transformations

Let V and W be finite-dimensional vector spaces. If $\dim(V) \ne \dim(W)$, then there can be no bijective linear transformation from V to W.







Isomorphisms 

Bijective linear transformations between vector spaces are sufficiently important that they have their own name. 



DEFINITION 



An isomorphism between V and W is a bijective linear transformation from V to W.



Note that if T is an isomorphism between V and W, then $T^{-1}$ exists and is an isomorphism between W and V. For this reason, we say that V and W are isomorphic if there is an isomorphism from V to W. The term isomorphic means "same shape," so isomorphic vector spaces have the same form or structure.

Theorem 8.6.1 does not guarantee that if $\dim(V) = \dim(W)$, then there is an isomorphism from V to W. However, every real vector space V of dimension n admits at least one bijective linear transformation to $R^n$: the transformation $T(\mathbf{v}) = (\mathbf{v})_S$ that takes a vector in V to its coordinate vector in $R^n$ with respect to the standard basis for $R^n$.

THEOREM 8.6.2 



Isomorphism Theorem 

Let V be a finite-dimensional real vector space. If $\dim(V) = n$, then there is an isomorphism from V to $R^n$.



We leave the proof of Theorem 8.6.2 as an exercise. 



EXAMPLE 2 An Isomorphism between $P_3$ and $R^4$

The vector space $P_3$ is isomorphic to $R^4$, because the transformation

$$T(a + bx + cx^2 + dx^3) = (a, b, c, d)$$

is one-to-one, onto, and linear (verify).



EXAMPLE 3 An Isomorphism between $M_{22}$ and $R^4$

The vector space $M_{22}$ is isomorphic to $R^4$, because the transformation

$$T\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = (a, b, c, d)$$

is one-to-one, onto, and linear (verify).



The significance of the Isomorphism Theorem is this: It is a formal statement of the fact, represented in Figure 8.4.5 and repeated here as Figure 8.6.1 for the case $V = W$, that any computation involving a linear operator T on V is equivalent to a computation involving a linear operator on $R^n$; that is, any computation involving a linear operator on V is equivalent to matrix multiplication. Operations on V are effectively the same as those on $R^n$.



Figure 8.6.1 Direct computation versus multiplication by A



If $\dim(V) = n$, then we say that V and $R^n$ have the same algebraic structure. This means that although the names conventionally given to the vectors and corresponding operations in V may differ from the corresponding traditional names in $R^n$, as vector spaces they really are the same.

Isomorphisms between Vector Spaces 

It is easy to show that compositions of bijective linear transformations are themselves bijective linear transformations. (See the 
exercises.) This leads to the following theorem. 



THEOREM 8.6.3 



Isomorphism of Finite-Dimensional Vector Spaces 

Let V and W be finite-dimensional vector spaces. If $\dim(V) = \dim(W)$, then V and W are isomorphic.



Proof We must show that there is an isomorphism from V to W. Let n be the common dimension of V and W. Then there is an isomorphism $T: V \to R^n$ by Theorem 8.6.2. Similarly, there is an isomorphism $S: W \to R^n$. Let $R = S^{-1}$. Then $R \circ T$ is an isomorphism from V to W, so V and W are isomorphic.



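EXAMPLE 4 An Isomorphism between $P_3$ and $M_{22}$

Because $\dim(P_3) = 4$ and $\dim(M_{22}) = 4$, these spaces are isomorphic. We can find an isomorphism T between them by identifying the natural bases for these spaces under $T: P_3 \to M_{22}$:

$$T(1) = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad
T(x) = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad
T(x^2) = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \quad
T(x^3) = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$$

If $p(x) = a + bx + cx^2 + dx^3$ is in $P_3$, then by linearity,

$$T(p(x)) = a\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + b\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} + c\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} + d\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

This is a one-to-one and onto linear transformation (verify), so it is an isomorphism between $P_3$ and $M_{22}$.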
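The map of Example 4 can be spot-checked on coefficient lists. The following is only an illustrative sketch: it assumes a polynomial $a + bx + cx^2 + dx^3$ is stored as the tuple (a, b, c, d), and the function names are hypothetical, not taken from the text. The assertions test linearity and invertibility on sample inputs; they do not replace the proof.

```python
def poly_to_matrix(p):
    """Send a + bx + cx^2 + dx^3, stored as (a, b, c, d), to [[a, b], [c, d]]."""
    a, b, c, d = p
    return [[a, b], [c, d]]


def matrix_to_poly(m):
    """Inverse map: send [[a, b], [c, d]] back to the tuple (a, b, c, d)."""
    return (m[0][0], m[0][1], m[1][0], m[1][1])


def lin_comb_poly(k, p, q):
    """Coefficient tuple of the polynomial k*p + q."""
    return tuple(k * x + y for x, y in zip(p, q))


def lin_comb_matrix(k, m, n):
    """Entries of the 2 x 2 matrix k*m + n."""
    return [[k * m[i][j] + n[i][j] for j in range(2)] for i in range(2)]


p = (1, 2, 3, 4)     # 1 + 2x + 3x^2 + 4x^3
q = (0, -1, 5, 2)    # -x + 5x^2 + 2x^3

# Linearity on a sample: T(3p + q) equals 3T(p) + T(q)
assert poly_to_matrix(lin_comb_poly(3, p, q)) == lin_comb_matrix(
    3, poly_to_matrix(p), poly_to_matrix(q))

# The inverse map undoes T, as expected of a bijection
assert matrix_to_poly(poly_to_matrix(p)) == p
```

Storing both spaces as four real numbers makes the isomorphism almost invisible in code, which is the point of the example: only the names of the vectors differ.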

In the sense of isomorphism, then, there is only one real vector space of dimension n, with many different names. We take $R^n$ as the canonical example of a real vector space of dimension n because of the importance of coordinate vectors. Coordinate vectors are vectors in $R^n$ because they are the vectors of the coefficients in linear combinations

$$\mathbf{v} = a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_n\mathbf{v}_n$$

and since our scalars $a_i$ are real, the coefficients $(a_1, \ldots, a_n)$ are real n-tuples.

Think for a moment about the practical import of this result. If you want to program a computer to perform linear operations, 
such as the basic operations of the calculus on polynomials, you can do it using matrix multiplication. If you want to do video 
game graphics requiring rotations and reflections, you can do it using matrix multiplication. (Indeed, the special architectures of 
high-end video game consoles are designed to optimize the speed of matrix-matrix and matrix-vector calculations for computing 
new positions of objects and for lighting and rendering them. Supercomputer clusters have been created from these devices!) This 
is why every high-level computer programming language has facilities for arrays (vectors and matrices). Isomorphism ensures 
that any linear operation on vector spaces can be done using just those capabilities, and most operations of interest either will be 
linear or may be approximated by a linear operator. 
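As one concrete instance of the remark about calculus on polynomials, differentiation on $P_3$ can be carried out as a matrix-vector product. This is a minimal sketch under stated assumptions: polynomials are represented by their coordinate vectors relative to the basis $\{1, x, x^2, x^3\}$, and the helper names `derivative_matrix` and `mat_vec` are hypothetical.

```python
def derivative_matrix(n):
    """Matrix of d/dx on P_n relative to the basis 1, x, ..., x^n.

    Column k holds the coordinates of d/dx (x^k) = k x^(k-1).
    """
    size = n + 1
    D = [[0] * size for _ in range(size)]
    for k in range(1, size):
        D[k - 1][k] = k
    return D


def mat_vec(M, v):
    """Ordinary matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# p(x) = 1 + 2x + 3x^2 + 4x^3 has coordinate vector (1, 2, 3, 4)
p = [1, 2, 3, 4]
print(mat_vec(derivative_matrix(3), p))  # [2, 6, 12, 0], i.e. 2 + 6x + 12x^2
```

The computer never handles a polynomial as such; it multiplies a coordinate vector in $R^4$ by a fixed matrix, which is what the isomorphism licenses.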



Exercise Set 8.6 






1. Which of the transformations in Exercise 1 of Section 8.3 are onto?

2. Let A be an $n \times n$ matrix. When is $T_A: R^n \to R^n$ not onto?

3. Which of the transformations in Exercise 3 of Section 8.3 are onto?

4. Which of the transformations in Exercise 4 of Section 8.3 are onto?

5. Which of the following transformations are bijections?

(a) $T: P_2(x) \to P_3(x)$, $T(p(x)) = xp(x)$

(b) $T: M_{22} \to M_{22}$, $T(A) = A^T$

(c) $T: R^4 \to R^3$, $T(x, y, z, w) = (x, y, 0)$

(d) $T: P_3 \to R^3$, $T(a + bx + cx^2 + dx^3) = (b, c, d)$



6. Show that the inverse of a bijective transformation from V to W is a bijective transformation from W to V. Also, show that the inverse of a bijective linear transformation is a bijective linear transformation.

7. Prove: There can be a surjective linear transformation from V to W only if $\dim(V) \ge \dim(W)$.



8. (a) Find an isomorphism between the vector space of all $3 \times 3$ symmetric matrices and $R^6$.

(b) Find two different isomorphisms between the vector space of all $2 \times 2$ matrices and $R^4$.

(c) Find an isomorphism between the vector space of all polynomials of degree at most 3 such that $p(0) = 0$ and $R^3$.

(d) Find an isomorphism between the vector space $\mathrm{span}\{1, \sin(x), \cos(x)\}$ and $R^3$.



9. Let S be the standard basis for $R^n$. Prove Theorem 8.6.2 by showing that the linear transformation $T: V \to R^n$ that maps $\mathbf{v} \in V$ to its coordinate vector $(\mathbf{v})_S$ in $R^n$ is an isomorphism.

10. Show that if $T_1$, $T_2$ are bijective linear transformations, then the composition $T_2 \circ T_1$ is a bijective linear transformation.



11. (For Readers Who Have Studied Calculus) 

How could differentiation of functions in the vector space $\mathrm{span}\{1, \sin(x), \cos(x), \sin(2x), \cos(2x)\}$ be computed by matrix multiplication in $R^5$? Use your method to find the derivative of $3 - 4\sin(x) + \sin(2x) + 5\cos(2x)$.



Discussion 
Discovery 



12. Isomorphisms preserve the algebraic structure of vector spaces. The geometric structure depends on notions of angle and distance and so, ultimately, on the inner product. If V and W are finite-dimensional inner product spaces, then we say that $T: V \to W$ is an inner product space isomorphism if it is an isomorphism between V and W, and furthermore,

$$\langle \mathbf{u}, \mathbf{v} \rangle_V = \langle T(\mathbf{u}), T(\mathbf{v}) \rangle_W$$

That is, the inner product of $\mathbf{u}$ and $\mathbf{v}$ in V is equal to the inner product of their images in W.

(a) Prove that an inner product space isomorphism preserves angles and distances; that is, the angle between $\mathbf{u}$ and $\mathbf{v}$ in V is equal to the angle between $T(\mathbf{u})$ and $T(\mathbf{v})$ in W, and $\|\mathbf{u} - \mathbf{v}\|_V = \|T(\mathbf{u}) - T(\mathbf{v})\|_W$.

(b) Prove that such a T maps orthonormal sets in V to orthonormal sets in W. Is this true for an isomorphism in general?

(c) Prove that if W is Euclidean n-space and if $\dim(V) = n$, then there is an inner product space isomorphism between V and W.

(d) Use the result of part (c) to prove that if $\dim(V) = \dim(W)$, then there is an inner product space isomorphism between V and W.

(e) Find an inner product space isomorphism between $P_5$ and $M_{23}$.
Chapter 8



Supplementary Exercises 



1. Let A be an $n \times n$ matrix, B a nonzero $n \times 1$ matrix, and $\mathbf{x}$ a vector in $R^n$ expressed in matrix notation. Is $T(\mathbf{x}) = A\mathbf{x} + B$ a linear operator on $R^n$? Justify your answer.



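2. Let

$$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

(a) Show that

$$A^2 = \begin{bmatrix} \cos 2\theta & -\sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{bmatrix} \quad \text{and} \quad A^3 = \begin{bmatrix} \cos 3\theta & -\sin 3\theta \\ \sin 3\theta & \cos 3\theta \end{bmatrix}$$

(b) Guess the form of the matrix $A^n$ for any positive integer n.

(c) By considering the geometric effect of $T: R^2 \to R^2$, where T is multiplication by A, obtain the result in (b) geometrically.

3. Let $\mathbf{v}_0$ be a fixed vector in an inner product space V, and let $T: V \to V$ be defined by $T(\mathbf{v}) = \langle \mathbf{v}, \mathbf{v}_0 \rangle \mathbf{v}_0$. Show that T is a linear operator on V.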



4. Let $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m$ be fixed vectors in $R^n$, and let $T: R^n \to R^m$ be the function defined by $T(\mathbf{x}) = (\mathbf{x} \cdot \mathbf{v}_1, \mathbf{x} \cdot \mathbf{v}_2, \ldots, \mathbf{x} \cdot \mathbf{v}_m)$, where $\mathbf{x} \cdot \mathbf{v}_i$ is the Euclidean inner product on $R^n$.

(a) Show that T is a linear transformation.

(b) Show that the matrix with row vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m$ is the standard matrix for T.

5. Let $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3, \mathbf{e}_4\}$ be the standard basis for $R^4$ and let $T: R^4 \to R^3$ be the linear transformation for which

$$T(\mathbf{e}_1) = (1, 2, 1), \quad T(\mathbf{e}_2) = (0, 1, 0), \quad T(\mathbf{e}_3) = (1, 3, 0), \quad T(\mathbf{e}_4) = (1, 1, 1)$$

(a) Find bases for the range and kernel of T.

(b) Find the rank and nullity of T.



6. Suppose that vectors in $R^3$ are denoted by $1 \times 3$ matrices, and define $T: R^3 \to R^3$ by

$$T([x_1 \;\; x_2 \;\; x_3]) = [x_1 \;\; x_2 \;\; x_3] \begin{bmatrix} -1 & 2 & 4 \\ 3 & 0 & 1 \\ 2 & 2 & 5 \end{bmatrix}$$

(a) Find a basis for the kernel of T.

(b) Find a basis for the range of T.



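7. Let $B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ be a basis for a vector space V, and let $T: V \to V$ be the linear operator for which

$$\begin{aligned}
T(\mathbf{v}_1) &= \mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_3 + 3\mathbf{v}_4 \\
T(\mathbf{v}_2) &= \mathbf{v}_1 - \mathbf{v}_2 + 2\mathbf{v}_3 + 2\mathbf{v}_4 \\
T(\mathbf{v}_3) &= 2\mathbf{v}_1 - 4\mathbf{v}_2 + 5\mathbf{v}_3 + 3\mathbf{v}_4 \\
T(\mathbf{v}_4) &= -2\mathbf{v}_1 + 6\mathbf{v}_2 - 6\mathbf{v}_3 - 2\mathbf{v}_4
\end{aligned}$$

(a) Find the rank and nullity of T.

(b) Determine whether T is one-to-one.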



8. Let V and W be vector spaces, let $T$, $T_1$, and $T_2$ be linear transformations from V to W, and let k be a scalar. Define new transformations, $T_1 + T_2$ and $kT$, by the formulas

$$(T_1 + T_2)(\mathbf{x}) = T_1(\mathbf{x}) + T_2(\mathbf{x})$$
$$(kT)(\mathbf{x}) = k(T(\mathbf{x}))$$

(a) Show that $(T_1 + T_2): V \to W$ and $kT: V \to W$ are linear transformations.

(b) Show that the set of all linear transformations from V to W with the operations in part (a) forms a vector space.



9. Let A and B be similar matrices. Prove:

(a) $A^T$ and $B^T$ are similar.

(b) If A and B are invertible, then $A^{-1}$ and $B^{-1}$ are similar.

10. (Fredholm Alternative Theorem)

Let $T: V \to V$ be a linear operator on an n-dimensional vector space. Prove that exactly one of the following statements holds:

(i) The equation $T(\mathbf{x}) = \mathbf{b}$ has a solution for all vectors $\mathbf{b}$ in V.

(ii) Nullity of $T > 0$.



11. Let $T: M_{22} \to M_{22}$ be the linear operator defined by

$$T(X) = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} X + X \begin{bmatrix} 0 & 0 \\ 1 & 1 \end{bmatrix}$$

Find the rank and nullity of T.

12. Prove: If A and B are similar matrices, and if B and C are similar matrices, then A and C are similar matrices.

13. Let $L: M_{22} \to M_{22}$ be the linear operator defined by $L(M) = M^T$. Find the matrix for L with respect to the standard basis for $M_{22}$.

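14. Let $B = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ and $B' = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ be bases for a vector space V, and let

$$P = \begin{bmatrix} 2 & -1 & 3 \\ 1 & 1 & 4 \\ 0 & 1 & 2 \end{bmatrix}$$

be the transition matrix from $B'$ to B.

(a) Express $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$ as linear combinations of $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$.

(b) Express $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$ as linear combinations of $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$.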



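15. Let $B = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ be a basis for a vector space V, and let $T: V \to V$ be a linear operator such that

$$[T]_B = \begin{bmatrix} -3 & 4 & 7 \\ 1 & 0 & -2 \\ 0 & 1 & 0 \end{bmatrix}$$

Find $[T]_{B'}$, where $B' = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is the basis for V defined by

$$\mathbf{v}_1 = \mathbf{u}_1, \quad \mathbf{v}_2 = \mathbf{u}_1 + \mathbf{u}_2, \quad \mathbf{v}_3 = \mathbf{u}_1 + \mathbf{u}_2 + \mathbf{u}_3$$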



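16. Show that the matrices

$$\begin{bmatrix} 1 & 1 \\ -1 & 4 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}$$

are similar but that

$$\begin{bmatrix} 3 & 1 \\ -6 & -2 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} -1 & 2 \\ 1 & 0 \end{bmatrix}$$

are not.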



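17. Suppose that $T: V \to V$ is a linear operator and B is a basis for V such that for any vector $\mathbf{x}$ in V,

$$[T(\mathbf{x})]_B = \begin{bmatrix} x_1 - x_2 + x_3 \\ x_2 \\ x_1 - x_3 \end{bmatrix} \quad \text{if} \quad [\mathbf{x}]_B = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$

Find $[T]_B$.

18. Let $T: V \to V$ be a linear operator. Prove that T is one-to-one if and only if $\det(T) \ne 0$.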



19. (For Readers Who Have Studied Calculus) 

(a) Show that if $\mathbf{f} = f(x)$, then the function $D: C^2(-\infty, \infty) \to F(-\infty, \infty)$ defined by $D(\mathbf{f}) = f''(x)$ is a linear transformation.

(b) Find a basis for the kernel of D.

(c) Show that the functions satisfying the equation $D(\mathbf{f}) = f(x)$ form a two-dimensional subspace of $C^2(-\infty, \infty)$, and find a basis for this subspace.



20. Let $T: P_2 \to R^3$ be the function defined by the formula

$$T(p(x)) = \begin{bmatrix} p(-1) \\ p(0) \\ p(1) \end{bmatrix}$$

(a) Find $T(x^2 + 5x + 6)$.

(b) Show that T is a linear transformation.

(c) Show that T is one-to-one.

(d) Find

$$T^{-1}\left(\begin{bmatrix} 0 \\ 3 \\ 0 \end{bmatrix}\right)$$

(e) Sketch the graph of the polynomial in part (d).



21. Let $x_1$, $x_2$, and $x_3$ be distinct real numbers such that $x_1 < x_2 < x_3$, and let $T: P_2 \to R^3$ be the function defined by the formula

$$T(p(x)) = \begin{bmatrix} p(x_1) \\ p(x_2) \\ p(x_3) \end{bmatrix}$$

(a) Show that T is a linear transformation.

(b) Show that T is one-to-one.

(c) Verify that if $a_1$, $a_2$, and $a_3$ are any real numbers, then

$$T^{-1}\left(\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}\right) = a_1P_1(x) + a_2P_2(x) + a_3P_3(x)$$

where

$$P_1(x) = \frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)}, \quad
P_2(x) = \frac{(x - x_1)(x - x_3)}{(x_2 - x_1)(x_2 - x_3)}, \quad
P_3(x) = \frac{(x - x_1)(x - x_2)}{(x_3 - x_1)(x_3 - x_2)}$$

(d) What relationship exists between the graph of the function

$$a_1P_1(x) + a_2P_2(x) + a_3P_3(x)$$

and the points $(x_1, a_1)$, $(x_2, a_2)$, and $(x_3, a_3)$?



22. (For Readers Who Have Studied Calculus) 

Let $p(x)$ and $q(x)$ be continuous functions, and let V be the subspace of $C(-\infty, +\infty)$ consisting of all twice differentiable functions. Define $L: V \to V$ by

$$L(y(x)) = y''(x) + p(x)y'(x) + q(x)y(x)$$

(a) Show that L is a linear transformation.

(b) Consider the special case where $p(x) = 0$ and $q(x) = 1$. Show that the function $\phi(x) = c_1\sin x + c_2\cos x$ is in the nullspace of L for all real values of $c_1$ and $c_2$.



23. (For Readers Who Have Studied Calculus) 

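Let $D: P_n \to P_n$ be the differentiation operator $D(\mathbf{p}) = \mathbf{p}'$. Show that the matrix for D with respect to the basis $B = \{1, x, x^2, \ldots, x^n\}$ is

$$\begin{bmatrix}
0 & 1 & 0 & 0 & \cdots & 0 \\
0 & 0 & 2 & 0 & \cdots & 0 \\
0 & 0 & 0 & 3 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & 0 & \cdots & n \\
0 & 0 & 0 & 0 & \cdots & 0
\end{bmatrix}$$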

24. (For Readers Who Have Studied Calculus) 

It can be shown that for any real number c, the vectors

$$1, \quad x - c, \quad \frac{(x - c)^2}{2!}, \quad \ldots, \quad \frac{(x - c)^n}{n!}$$

form a basis for $P_n$. Find the matrix for the differentiation operator of Exercise 23 with respect to this basis.



25. (For Readers Who Have Studied Calculus) 

Let $J: P_n \to P_{n+1}$ be the integration transformation defined by

$$J(\mathbf{p}) = \int_0^x (a_0 + a_1t + \cdots + a_nt^n)\,dt = a_0x + \frac{a_1}{2}x^2 + \cdots + \frac{a_n}{n+1}x^{n+1}$$

where $\mathbf{p} = a_0 + a_1x + \cdots + a_nx^n$. Find the matrix for J with respect to the standard bases for $P_n$ and $P_{n+1}$.
Chapter 8

Technology Exercises



The following exercise is designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple, 
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear 
algebra capabilities. For this exercise you will need to read the relevant documentation for the particular utility you are using. The 
goal of this exercise is to provide you with a basic proficiency with your technology utility. Once you have mastered the 
techniques in this exercise, you will be able to use your technology utility to solve many of the problems in the regular exercise 
sets. 

Section 8.3 

T1. (Transition Matrices) Use your technology utility to verify Formula (5).

Section 8.5 

T1. (Similarity Invariants) Choose a nonzero $3 \times 3$ matrix A and an invertible $3 \times 3$ matrix P. Compute $P^{-1}AP$ and confirm the statements in Table 1.
CHAPTER 9

Additional Topics



INTRODUCTION: In this chapter we shall see how some of the topics that we have studied in earlier chapters can be 
applied to other areas of mathematics, such as differential equations, analytic geometry, curve fitting, and Fourier series. The 
chapter concludes by returning once again to the fundamental problem of solving systems of linear equations $A\mathbf{x} = \mathbf{b}$. This
time we solve a system not by another elimination procedure but by factoring the coefficient matrix into two different triangular 
matrices. This is the method that is generally used in computer programs for solving linear systems in real-world applications. 
9.1 APPLICATION TO DIFFERENTIAL EQUATIONS

Many laws of physics, chemistry, biology, engineering, and economics are described in terms of differential equations, that is, equations involving functions and their derivatives. The purpose of this section is to illustrate one way in which linear algebra can be applied to certain systems of differential equations. The scope of this section is narrow, but it illustrates an important area of application of linear algebra.



Terminology 

One of the simplest differential equations is

$$y' = ay \tag{1}$$

where $y = f(x)$ is an unknown function to be determined, $y' = dy/dx$ is its derivative, and $a$ is a constant. Like most differential equations, (1) has infinitely many solutions; they are the functions of the form

$$y = ce^{ax} \tag{2}$$

where $c$ is an arbitrary constant. Each function of this form is a solution of $y' = ay$ since

$$y' = cae^{ax} = ay$$

Conversely, every solution of $y' = ay$ must be a function of the form $ce^{ax}$ (Exercise 5), so (2) describes all solutions of $y' = ay$. We call (2) the general solution of $y' = ay$.

Sometimes the physical problem that generates a differential equation imposes some added conditions that enable us to isolate one particular solution from the general solution. For example, if we require that the solution of $y' = ay$ satisfy the added condition

$$y(0) = 3 \tag{3}$$

that is, $y = 3$ when $x = 0$, then on substituting these values in the general solution $y = ce^{ax}$ we obtain a value for $c$, namely $3 = ce^0 = c$. Thus

$$y = 3e^{ax}$$

is the only solution of $y' = ay$ that satisfies the added condition. A condition such as (3), which specifies the value of the solution at a point, is called an initial condition, and the problem of solving a differential equation subject to an initial condition is called an initial-value problem.
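The initial-value problem above can be checked numerically. The sketch below is illustrative only: the value a = 0.5 is an arbitrary assumption, the function name is hypothetical, and the derivative is estimated with a central difference rather than computed exactly.

```python
import math


def exponential_solution(a, y0):
    """Return the function y(x) = y0 * e^(a x), which satisfies y' = a y, y(0) = y0."""
    return lambda x: y0 * math.exp(a * x)


a = 0.5                               # sample value of the constant a (an assumption)
y = exponential_solution(a, 3.0)      # initial condition y(0) = 3

# Compare a central-difference estimate of y'(x) with a * y(x) at a few points
h = 1e-6
for x in (0.0, 1.0, 2.0):
    slope = (y(x + h) - y(x - h)) / (2 * h)
    assert abs(slope - a * y(x)) < 1e-6

print(y(0.0))  # 3.0
```

A finite-difference check of this kind only confirms the formula at sample points; the uniqueness claim still rests on Exercise 5.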

Linear Systems of First-Order Equations 

In this section we will be concerned with solving systems of differential equations having the form 

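$$\begin{aligned}
y_1' &= a_{11}y_1 + a_{12}y_2 + \cdots + a_{1n}y_n \\
y_2' &= a_{21}y_1 + a_{22}y_2 + \cdots + a_{2n}y_n \\
&\;\;\vdots \\
y_n' &= a_{n1}y_1 + a_{n2}y_2 + \cdots + a_{nn}y_n
\end{aligned} \tag{4}$$

where $y_1 = f_1(x)$, $y_2 = f_2(x)$, ..., $y_n = f_n(x)$ are functions to be determined, and the $a_{ij}$'s are constants. In matrix notation, (4) can be written as

$$\begin{bmatrix} y_1' \\ y_2' \\ \vdots \\ y_n' \end{bmatrix} =
\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$

or, more briefly,

$$\mathbf{y}' = A\mathbf{y}$$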

EXAMPLE 1 Solution of a System with Initial Conditions 



(a) Write the following system in matrix form:

$$\begin{aligned}
y_1' &= 3y_1 \\
y_2' &= -2y_2 \\
y_3' &= 5y_3
\end{aligned}$$

(b) Solve the system.

(c) Find a solution of the system that satisfies the initial conditions $y_1(0) = 1$, $y_2(0) = 4$, and $y_3(0) = -2$.



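Solution (a)

$$\begin{bmatrix} y_1' \\ y_2' \\ y_3' \end{bmatrix} =
\begin{bmatrix} 3 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 5 \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} \tag{5}$$

or

$$\mathbf{y}' = \begin{bmatrix} 3 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 5 \end{bmatrix} \mathbf{y}$$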



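Solution (b)

Because each equation involves only one unknown function, we can solve the equations individually. From (2), we obtain

$$y_1 = c_1e^{3x}, \quad y_2 = c_2e^{-2x}, \quad y_3 = c_3e^{5x}$$

or, in matrix notation,

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} c_1e^{3x} \\ c_2e^{-2x} \\ c_3e^{5x} \end{bmatrix} \tag{6}$$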



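Solution (c)

From the given initial conditions, we obtain

$$\begin{aligned}
1 &= y_1(0) = c_1e^0 = c_1 \\
4 &= y_2(0) = c_2e^0 = c_2 \\
-2 &= y_3(0) = c_3e^0 = c_3
\end{aligned}$$

so the solution satisfying the initial conditions is

$$y_1 = e^{3x}, \quad y_2 = 4e^{-2x}, \quad y_3 = -2e^{5x}$$

or, in matrix notation,

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} e^{3x} \\ 4e^{-2x} \\ -2e^{5x} \end{bmatrix}$$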

The system in the preceding example is easy to solve because each equation involves only one unknown function, and this is the case because the matrix of coefficients for the system in (5) is diagonal. But how do we handle a system $\mathbf{y}' = A\mathbf{y}$ in which the matrix A is not diagonal? The idea is simple: Try to make a substitution for $\mathbf{y}$ that will yield a new system with a diagonal coefficient matrix; solve this new simpler system, and then use this solution to determine the solution of the original system.



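The kind of substitution we have in mind is

$$\begin{aligned}
y_1 &= p_{11}u_1 + p_{12}u_2 + \cdots + p_{1n}u_n \\
y_2 &= p_{21}u_1 + p_{22}u_2 + \cdots + p_{2n}u_n \\
&\;\;\vdots \\
y_n &= p_{n1}u_1 + p_{n2}u_2 + \cdots + p_{nn}u_n
\end{aligned} \tag{7}$$

or, in matrix notation,

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} =
\begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & \vdots & & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nn} \end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}$$

or, more briefly,

$$\mathbf{y} = P\mathbf{u}$$

In this substitution, the $p_{ij}$'s are constants to be determined in such a way that the new system involving the unknown functions $u_1, u_2, \ldots, u_n$ has a diagonal coefficient matrix. We leave it for the reader to differentiate each equation in (7) and deduce

$$\mathbf{y}' = P\mathbf{u}'$$

If we make the substitutions $\mathbf{y} = P\mathbf{u}$ and $\mathbf{y}' = P\mathbf{u}'$ in the original system

$$\mathbf{y}' = A\mathbf{y}$$

and if we assume P to be invertible, then we obtain

$$P\mathbf{u}' = A(P\mathbf{u})$$

or

$$\mathbf{u}' = (P^{-1}AP)\mathbf{u} \quad \text{or} \quad \mathbf{u}' = D\mathbf{u}$$

where $D = P^{-1}AP$. The choice for P is now clear; if we want the new coefficient matrix D to be diagonal, we must choose P to be a matrix that diagonalizes A.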

Solution by Diagonalization 

The preceding discussion suggests the following procedure for solving a system y' = Ay with a diagonalizable coefficient 
matrix A. 



Step 1. Find a matrix P that diagonalizes A.

Step 2. Make the substitutions $\mathbf{y} = P\mathbf{u}$ and $\mathbf{y}' = P\mathbf{u}'$ to obtain a new "diagonal system" $\mathbf{u}' = D\mathbf{u}$, where $D = P^{-1}AP$.

Step 3. Solve $\mathbf{u}' = D\mathbf{u}$.

Step 4. Determine $\mathbf{y}$ from the equation $\mathbf{y} = P\mathbf{u}$.



EXAMPLE 2 Solution Using Diagonalization 



(a) Solve the system

$$\begin{aligned}
y_1' &= y_1 + y_2 \\
y_2' &= 4y_1 - 2y_2
\end{aligned}$$

(b) Find the solution that satisfies the initial conditions $y_1(0) = 1$, $y_2(0) = 6$.



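Solution (a)

The coefficient matrix for the system is

$$A = \begin{bmatrix} 1 & 1 \\ 4 & -2 \end{bmatrix}$$

As discussed in Section 7.2, A will be diagonalized by any matrix P whose columns are linearly independent eigenvectors of A. Since

$$\det(\lambda I - A) = \begin{vmatrix} \lambda - 1 & -1 \\ -4 & \lambda + 2 \end{vmatrix} = \lambda^2 + \lambda - 6 = (\lambda + 3)(\lambda - 2)$$

the eigenvalues of A are $\lambda = 2$, $\lambda = -3$. By definition,

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

is an eigenvector of A corresponding to $\lambda$ if and only if $\mathbf{x}$ is a nontrivial solution of $(\lambda I - A)\mathbf{x} = \mathbf{0}$, that is, of

$$\begin{bmatrix} \lambda - 1 & -1 \\ -4 & \lambda + 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

If $\lambda = 2$, this system becomes

$$\begin{bmatrix} 1 & -1 \\ -4 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

Solving this system yields $x_1 = t$, $x_2 = t$, so

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} t \\ t \end{bmatrix} = t\begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

Thus

$$\mathbf{p}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

is a basis for the eigenspace corresponding to $\lambda = 2$. Similarly, the reader can show that

$$\mathbf{p}_2 = \begin{bmatrix} -\tfrac{1}{4} \\ 1 \end{bmatrix}$$

is a basis for the eigenspace corresponding to $\lambda = -3$. Thus

$$P = \begin{bmatrix} 1 & -\tfrac{1}{4} \\ 1 & 1 \end{bmatrix}$$

diagonalizes A, and

$$D = P^{-1}AP = \begin{bmatrix} 2 & 0 \\ 0 & -3 \end{bmatrix}$$

Therefore, the substitution

$$\mathbf{y} = P\mathbf{u} \quad \text{and} \quad \mathbf{y}' = P\mathbf{u}'$$

yields the new "diagonal system"

$$\mathbf{u}' = D\mathbf{u} = \begin{bmatrix} 2 & 0 \\ 0 & -3 \end{bmatrix}\mathbf{u} \quad \text{or} \quad \begin{aligned} u_1' &= 2u_1 \\ u_2' &= -3u_2 \end{aligned}$$

From (2) the solution of this system is

$$\begin{aligned} u_1 &= c_1e^{2x} \\ u_2 &= c_2e^{-3x} \end{aligned} \quad \text{or} \quad \mathbf{u} = \begin{bmatrix} c_1e^{2x} \\ c_2e^{-3x} \end{bmatrix}$$

so the equation $\mathbf{y} = P\mathbf{u}$ yields, as the solution for $\mathbf{y}$,

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} 1 & -\tfrac{1}{4} \\ 1 & 1 \end{bmatrix} \begin{bmatrix} c_1e^{2x} \\ c_2e^{-3x} \end{bmatrix} = \begin{bmatrix} c_1e^{2x} - \tfrac{1}{4}c_2e^{-3x} \\ c_1e^{2x} + c_2e^{-3x} \end{bmatrix}$$

or

$$\begin{aligned}
y_1 &= c_1e^{2x} - \tfrac{1}{4}c_2e^{-3x} \\
y_2 &= c_1e^{2x} + c_2e^{-3x}
\end{aligned} \tag{8}$$

Solution (b)

If we substitute the given initial conditions in (8), we obtain

$$\begin{aligned}
c_1 - \tfrac{1}{4}c_2 &= 1 \\
c_1 + c_2 &= 6
\end{aligned}$$

Solving this system, we obtain $c_1 = 2$, $c_2 = 4$, so from (8), the solution satisfying the initial conditions is

$$\begin{aligned}
y_1 &= 2e^{2x} - e^{-3x} \\
y_2 &= 2e^{2x} + 4e^{-3x}
\end{aligned}$$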

We have assumed in this section that the coefficient matrix of y' = Ay is diagonalizable. If this is not the case, other methods 
must be used to solve the system. Such methods are discussed in more advanced texts. 
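The four-step procedure can be traced in code for the $2 \times 2$ case. This is a minimal sketch, not a general solver: it assumes A has two distinct real eigenvalues, it obtains them from the characteristic polynomial $\lambda^2 - \mathrm{tr}(A)\lambda + \det(A)$, and the function names are hypothetical. The eigenvectors it picks are scalar multiples of those in Example 2, which changes the constants $c_1$, $c_2$ but not the solution.

```python
import math


def eigen_2x2(A):
    """Eigenvalues and eigenvectors of a 2 x 2 matrix with distinct real eigenvalues."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc <= 0:
        raise ValueError("sketch assumes two distinct real eigenvalues")
    r = math.sqrt(disc)
    lams = ((tr + r) / 2, (tr - r) / 2)
    vecs = []
    for lam in lams:
        if b != 0:
            vecs.append((b, lam - a))       # solves (A - lam I) v = 0
        elif c != 0:
            vecs.append((lam - d, c))
        else:                               # A is already diagonal
            vecs.append((1.0, 0.0) if abs(lam - a) <= abs(lam - d) else (0.0, 1.0))
    return lams, vecs


def solve_by_diagonalization(A, y0):
    """Return y(x) solving y' = A y with y(0) = y0, following Steps 1 to 4."""
    lams, (p1, p2) = eigen_2x2(A)           # Step 1: columns of P are p1, p2
    det_p = p1[0] * p2[1] - p2[0] * p1[1]
    # Initial condition: solve P c = y0 by Cramer's rule
    c1 = (y0[0] * p2[1] - p2[0] * y0[1]) / det_p
    c2 = (p1[0] * y0[1] - y0[0] * p1[1]) / det_p

    def y(x):
        u1 = c1 * math.exp(lams[0] * x)     # Steps 2 and 3: u' = D u solved entrywise
        u2 = c2 * math.exp(lams[1] * x)
        return (p1[0] * u1 + p2[0] * u2,    # Step 4: y = P u
                p1[1] * u1 + p2[1] * u2)

    return y


# Example 2: y1' = y1 + y2, y2' = 4 y1 - 2 y2, y1(0) = 1, y2(0) = 6
y = solve_by_diagonalization([[1, 1], [4, -2]], (1, 6))
x = 0.5
y1, y2 = y(x)
assert abs(y1 - (2 * math.exp(2 * x) - math.exp(-3 * x))) < 1e-9
assert abs(y2 - (2 * math.exp(2 * x) + 4 * math.exp(-3 * x))) < 1e-9
```

The assertions compare the computed values with the closed-form answer of Example 2 at one sample point.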



Exercise Set 9.1 






1. (a) Solve the system

$$\begin{aligned}
y_1' &= y_1 + 4y_2 \\
y_2' &= 2y_1 + 3y_2
\end{aligned}$$

(b) Find the solution that satisfies the initial conditions $y_1(0) = 0$, $y_2(0) = 0$.

2. (a) Solve the system

$$y_1' = y_1 + 3y_2$$

(b) Find the solution that satisfies the conditions $y_1(0) = 2$, $y_2(0) = 1$.

3. (a) Solve the system

$$\begin{aligned}
y_1' &= 4y_1 + y_3 \\
y_2' &= -2y_1 + y_2 \\
y_3' &= -2y_1 + y_3
\end{aligned}$$

(b) Find the solution that satisfies the initial conditions $y_1(0) = -1$, $y_2(0) = 1$, $y_3(0) = 0$.

4. Solve the system

$$\begin{aligned}
y_2' &= 2y_1 + 4y_2 + 2y_3 \\
y_3' &= 2y_1 + 2y_2 + 4y_3
\end{aligned}$$


5. Show that every solution of $y' = ay$ has the form $y = ce^{ax}$.

Hint Let $y = f(x)$ be a solution of the equation, and show that $f(x)e^{-ax}$ is constant.

6. Show that if A is diagonalizable and

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$

satisfies $\mathbf{y}' = A\mathbf{y}$, then each $y_i$ is a linear combination of $e^{\lambda_1x}, e^{\lambda_2x}, \ldots, e^{\lambda_nx}$, where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of A.



7. It is possible to solve a single differential equation by expressing the equation as a system and then using the method of this section. For the differential equation $y'' - y' - 6y = 0$, show that the substitutions $y_1 = y$ and $y_2 = y'$ lead to the system

$$\begin{aligned}
y_1' &= y_2 \\
y_2' &= 6y_1 + y_2
\end{aligned}$$

Solve this system and then solve the original differential equation.

8. Use the procedure in Exercise 7 to solve $y'' + y' - 12y = 0$.

9. Discuss: How can the procedure in Exercise 7 be used to solve $y''' - 6y'' + 11y' - 6y = 0$? Carry out your ideas.



Discussion 
Discovery 



10. (a) By rewriting (8) in matrix form, show that the solution of the system in Example 2 can be expressed as

$$\mathbf{y} = c_1e^{2x}\begin{bmatrix} 1 \\ 1 \end{bmatrix} + c_2e^{-3x}\begin{bmatrix} -\tfrac{1}{4} \\ 1 \end{bmatrix}$$

This is called the general solution of the system.

(b) Note that in part (a), the vector in the first term is an eigenvector corresponding to the eigenvalue $\lambda_1 = 2$ and the vector in the second term is an eigenvector corresponding to the eigenvalue $\lambda_2 = -3$. This is a special case of the following general result:

THEOREM

If the coefficient matrix A of the system $\mathbf{y}' = A\mathbf{y}$ in (4) is diagonalizable, then the general solution of the system can be expressed as

$$\mathbf{y} = c_1e^{\lambda_1x}\mathbf{x}_1 + c_2e^{\lambda_2x}\mathbf{x}_2 + \cdots + c_ne^{\lambda_nx}\mathbf{x}_n$$

where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of A, and $\mathbf{x}_i$ is an eigenvector of A corresponding to $\lambda_i$.

Prove this result by tracing through the four-step procedure discussed in the section with

$$D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \quad \text{and} \quad P = [\mathbf{x}_1 \mid \mathbf{x}_2 \mid \cdots \mid \mathbf{x}_n]$$

11. Consider the system of differential equations $\mathbf{y}' = A\mathbf{y}$ where A is a $2 \times 2$ matrix. For what values of $a_{11}$, $a_{12}$, $a_{21}$, $a_{22}$ do the component solutions $y_1(t)$, $y_2(t)$ tend to zero as $t \to \infty$? In particular, what must be true about the determinant and the trace of A for this to happen?

12. Solve the nondiagonalizable system $y_1' = y_1 + y_2$, $y_2' = y_2$.

13. Use diagonalization to solve the system $y_1' = 2y_1 + y_2 + t$, $y_2' = y_1 + 2y_2 + 2t$ by first writing it in the form $\mathbf{y}' = A\mathbf{y} + \mathbf{f}$. Note the presence of a forcing function in each equation.

14. Use diagonalization to solve the system $y_1' = y_1 + y_2 + e^t$, $y_2' = y_1 - y_2 + e^{-t}$ by first writing it in the form $\mathbf{y}' = A\mathbf{y} + \mathbf{f}$. Note the presence of a forcing function in each equation.
9.2 GEOMETRY OF LINEAR OPERATORS ON $R^2$



In Section 4.2 we studied some of the geometric properties of linear operators 
on R 2 and R 3 . In this section we shall study linear operators on R 2 in a little 
more depth. Some of the ideas that will be developed here have important 
applications to the field of computer graphics. 



Vectors or Points 

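If $T: R^2 \to R^2$ is the matrix operator whose standard matrix is

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

then

$$T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} ax + by \\ cx + dy \end{bmatrix} \tag{1}$$

There are two equally good geometric interpretations of this formula. We may view the entries in the matrices

$$\begin{bmatrix} x \\ y \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} ax + by \\ cx + dy \end{bmatrix}$$

either as components of vectors or as coordinates of points. With the first interpretation, T maps arrows to arrows, and with the second, points to points (Figure 9.2.1). The choice is a matter of taste.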




Figure 9.2.1 (a) T maps vectors to vectors. (b) T maps points to points: $(x, y)$ is sent to $(ax + by, cx + dy)$.

In this section we shall view linear operators on $R^2$ as mapping points to points. One useful device for visualizing the behavior of a linear operator is to observe its effect on the points of simple figures in the plane. For example, Table 1 shows the effect of some basic linear operators on a unit square that has been partially colored.



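Table 1

Operator | Standard Matrix | Effect on the Unit Square (image of the corner $(1, 1)$)

Reflection about the y-axis | $\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$ | $(-1, 1)$

Reflection about the x-axis | $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$ | $(1, -1)$

Reflection about the line $y = x$ | $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ | $(1, 1)$

Counterclockwise rotation through an angle $\theta$ | $\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$ | $(\cos\theta - \sin\theta,\ \sin\theta + \cos\theta)$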
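The entries of Table 1 can be checked numerically. The following is a minimal Python sketch, not part of the original text; the helper names `apply`, `rotation`, and `image_of_square` are illustrative assumptions. It multiplies each corner of the unit square by one of the standard matrices in the table.

```python
import math

def apply(matrix, point):
    """Multiply a 2x2 matrix (nested lists) by the point (x, y)."""
    (a, b), (c, d) = matrix
    x, y = point
    return (a * x + b * y, c * x + d * y)

def rotation(theta):
    """Standard matrix for a counterclockwise rotation through theta radians."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

# Standard matrices from Table 1.
REFLECT_Y_AXIS = [[-1, 0], [0, 1]]
REFLECT_X_AXIS = [[1, 0], [0, -1]]
REFLECT_Y_EQ_X = [[0, 1], [1, 0]]

# Corners of the unit square, listed counterclockwise from the origin.
UNIT_SQUARE = [(0, 0), (1, 0), (1, 1), (0, 1)]

def image_of_square(matrix):
    """Images of the four corners of the unit square under the operator."""
    return [apply(matrix, p) for p in UNIT_SQUARE]

if __name__ == "__main__":
    print(image_of_square(REFLECT_Y_AXIS))
    print(image_of_square(rotation(math.pi / 2)))
```

Because the operators are viewed as mapping points to points, tracking the four corners is enough to sketch the image of the whole square.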



In Section 4.2 we discussed reflections, projections, rotations, contractions, and dilations of $R^2$. We shall now consider some other basic linear operators on $R^2$.

Compressions and Expansions 

If the x-coordinate of each point in the plane is multiplied by a positive constant $k$, then the effect is to compress or expand each plane figure in the x-direction. If $0 < k < 1$, the result is a compression, and if $k > 1$, it is an expansion (Figure 9.2.2). We call such an operator a compression (or an expansion) in the x-direction with factor $k$. Similarly, if the y-coordinate of each point is multiplied by a positive constant $k$, we obtain a compression (or expansion) in the y-direction with factor $k$. It can be shown that compressions and expansions along the coordinate axes are linear transformations.







Figure 9.2.2 (a) Unit square. (b) Compression. (c) Expansion, $k = 2$.



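If $T: R^2 \to R^2$ is a compression or expansion in the x-direction with factor $k$, then

\[ T(\mathbf{e}_1) = T\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) = \begin{bmatrix} k \\ 0 \end{bmatrix}, \qquad T(\mathbf{e}_2) = T\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]

so the standard matrix for $T$ is

\[ \begin{bmatrix} k & 0 \\ 0 & 1 \end{bmatrix} \]

Similarly, the standard matrix for a compression or expansion in the y-direction is

\[ \begin{bmatrix} 1 & 0 \\ 0 & k \end{bmatrix} \]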



EXAMPLE 1 Operating with Diagonal Matrices 



Suppose that the xy-plane first is compressed or expanded by a factor of $k_1$ in the x-direction and then is compressed or expanded by a factor of $k_2$ in the y-direction. Find a single matrix operator that performs both operations.

Solution

The standard matrices for the two operations are

\[ \underbrace{\begin{bmatrix} k_1 & 0 \\ 0 & 1 \end{bmatrix}}_{x\text{-compression (expansion)}} \qquad \underbrace{\begin{bmatrix} 1 & 0 \\ 0 & k_2 \end{bmatrix}}_{y\text{-compression (expansion)}} \]

Thus the standard matrix for the composition of the x-operation followed by the y-operation is

\[ A = \begin{bmatrix} 1 & 0 \\ 0 & k_2 \end{bmatrix}\begin{bmatrix} k_1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} k_1 & 0 \\ 0 & k_2 \end{bmatrix} \tag{2} \]

This shows that multiplication by a diagonal $2 \times 2$ matrix compresses or expands the plane in the x-direction and also in the y-direction. In the special case where $k_1$ and $k_2$ are the same, say $k_1 = k_2 = k$, note that (2) simplifies to

\[ A = \begin{bmatrix} k & 0 \\ 0 & k \end{bmatrix} \]

which is a contraction or a dilation (Table 8 of Section 4.2).

Shears 

A shear in the x-direction with factor $k$ is a transformation that moves each point $(x, y)$ parallel to the x-axis by an amount $ky$ to the new position $(x + ky, y)$. Under such a transformation, points on the x-axis are unmoved since $y = 0$. However, as we progress away from the x-axis, the magnitude of $y$ increases, so points farther from the x-axis move a greater distance than those closer (Figure 9.2.3).




Figure 9.2.3 Panels (b) and (c) show shears in the x-direction with factor $k > 0$ and $k < 0$, respectively; each point $(x, y)$ moves to $(x + ky, y)$.



A shear in the y-direction with factor $k$ is a transformation that moves each point $(x, y)$ parallel to the y-axis by an amount $kx$ to the new position $(x, y + kx)$. Under such a transformation, points on the y-axis remain fixed, and points farther from the y-axis move a greater distance than those that are closer.

It can be shown that shears are linear transformations. If $T: R^2 \to R^2$ is a shear with factor $k$ in the x-direction, then

\[ T(\mathbf{e}_1) = T\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad T(\mathbf{e}_2) = T\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} k \\ 1 \end{bmatrix} \]

so the standard matrix for $T$ is

\[ \begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix} \]

Similarly, the standard matrix for a shear in the y-direction with factor $k$ is

\[ \begin{bmatrix} 1 & 0 \\ k & 1 \end{bmatrix} \]



Remark Multiplication by the $2 \times 2$ identity matrix is the identity operator on $R^2$. This operator can be viewed as a rotation through 0°, or as a shear along either axis with $k = 0$, or as a compression or expansion along either axis with factor $k = 1$.



EXAMPLE 2 Finding Matrix Transformations 



(a) Find a matrix transformation from $R^2$ to $R^2$ that first shears by a factor of 2 in the x-direction and then reflects about $y = x$.

(b) Find a matrix transformation from $R^2$ to $R^2$ that first reflects about $y = x$ and then shears by a factor of 2 in the x-direction.

Solution (a)

The standard matrix for the shear is

\[ A_1 = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \]

and for the reflection is

\[ A_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \]

Thus the standard matrix for the shear followed by the reflection is

\[ A_2 A_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 2 \end{bmatrix} \]

Solution (b)

The reflection followed by the shear is represented by

\[ A_1 A_2 = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 0 \end{bmatrix} \]

In the last example, note that $A_1 A_2 \neq A_2 A_1$, so the effect of shearing and then reflecting is different from the effect of reflecting and then shearing. This is illustrated geometrically in Figure 9.2.4, where we show the effects of the transformations on a unit square.



Figure 9.2.4 Panels (a) and (b): the unit square under a shear in the x-direction with $k = 2$ and a reflection about $y = x$, applied in the two possible orders.
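The order dependence in Example 2 can be reproduced with a few lines of code. This is a minimal Python sketch under the assumption that matrices are stored as nested lists; the names `matmul`, `apply`, `SHEAR_X_2`, and `REFLECT_Y_EQ_X` are illustrative and do not come from the text.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(m, p):
    """Image of the point p = (x, y) under multiplication by m."""
    return (m[0][0] * p[0] + m[0][1] * p[1],
            m[1][0] * p[0] + m[1][1] * p[1])

SHEAR_X_2 = [[1, 2], [0, 1]]        # A1: shear by a factor of 2 in the x-direction
REFLECT_Y_EQ_X = [[0, 1], [1, 0]]   # A2: reflection about y = x

# The operation performed first is the rightmost factor.
shear_then_reflect = matmul(REFLECT_Y_EQ_X, SHEAR_X_2)   # A2 A1
reflect_then_shear = matmul(SHEAR_X_2, REFLECT_Y_EQ_X)   # A1 A2

if __name__ == "__main__":
    print(shear_then_reflect, apply(shear_then_reflect, (1, 1)))
    print(reflect_then_shear, apply(reflect_then_shear, (1, 1)))
```

The two products send the corner $(1, 1)$ of the unit square to different points, which is one concrete way to see that the operators do not commute.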



EXAMPLE 3 Transformations Using Elementary Matrices 



Show that if $T: R^2 \to R^2$ is multiplication by an elementary matrix, then the transformation is one of the following:



(a) a shear along a coordinate axis 



(b) a reflection about y = x 



(c) a compression along a coordinate axis 



(d) an expansion along a coordinate axis 



(e) a reflection about a coordinate axis 



(f) a compression or expansion along a coordinate axis followed by a reflection about a coordinate axis. 



Solution 



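Because a $2 \times 2$ elementary matrix results from performing a single elementary row operation on the $2 \times 2$ identity matrix, it must have one of the following forms (verify):

\[ \begin{bmatrix} 1 & 0 \\ k & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad \begin{bmatrix} k & 0 \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 \\ 0 & k \end{bmatrix} \]

The first two matrices represent shears along coordinate axes; the third represents a reflection about $y = x$. If $k > 0$, the last two matrices represent compressions or expansions along coordinate axes, depending on whether $0 < k < 1$ or $k > 1$. If $k < 0$, and if we express $k$ in the form $k = -k_1$, where $k_1 > 0$, then the last two matrices can be written as

\[ \begin{bmatrix} k & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -k_1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} k_1 & 0 \\ 0 & 1 \end{bmatrix} \tag{3} \]

\[ \begin{bmatrix} 1 & 0 \\ 0 & k \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -k_1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & k_1 \end{bmatrix} \tag{4} \]

Since $k_1 > 0$, the product in (3) represents a compression or expansion along the x-axis followed by a reflection about the y-axis, and (4) represents a compression or expansion along the y-axis followed by a reflection about the x-axis. In the case where $k = -1$, transformations (3) and (4) are simply reflections about the y-axis and x-axis, respectively.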

♦ 

Reflections, rotations, compressions, expansions, and shears are all one-to-one linear operators. This is evident geometrically, 
since all of those operators map distinct points into distinct points. This can also be checked algebraically by verifying that the 
standard matrices for those operators are invertible. 



EXAMPLE 4 A Transformation and Its Inverse 

It is intuitively clear that if we compress the xy-plane by a factor of $\tfrac{1}{2}$ in the y-direction, then we must expand the xy-plane by a factor of 2 in the y-direction to move each point back to its original position. This is indeed the case, since

\[ A = \begin{bmatrix} 1 & 0 \\ 0 & \tfrac{1}{2} \end{bmatrix} \]

represents a compression of factor $\tfrac{1}{2}$ in the y-direction, and

\[ A^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \]

is an expansion of factor 2 in the y-direction.

Geometric Properties of Linear Operators on R 2 

We conclude this section with two theorems that provide some insight into the geometric properties of linear operators on $R^2$.



THEOREM 9.2.1 



If $T: R^2 \to R^2$ is multiplication by an invertible matrix $A$, then the geometric effect of $T$ is the same as an appropriate succession of shears, compressions, expansions, and reflections.

Proof Since $A$ is invertible, it can be reduced to the identity by a finite sequence of elementary row operations. An elementary row operation can be performed by multiplying on the left by an elementary matrix, and so there exist elementary matrices $E_1, E_2, \ldots, E_k$ such that

\[ E_k \cdots E_2 E_1 A = I \]

Solving for $A$ yields

\[ A = E_1^{-1} E_2^{-1} \cdots E_k^{-1} I \]

or, equivalently,

\[ A = E_1^{-1} E_2^{-1} \cdots E_k^{-1} \tag{5} \]

This equation expresses $A$ as a product of elementary matrices (since the inverse of an elementary matrix is also elementary by Theorem 1.5.2). The result now follows from Example 3.



EXAMPLE 5 Geometric Effect of Multiplication by a Matrix 



Assuming that $k_1$ and $k_2$ are positive, express the diagonal matrix

\[ A = \begin{bmatrix} k_1 & 0 \\ 0 & k_2 \end{bmatrix} \]

as a product of elementary matrices, and describe the geometric effect of multiplication by $A$ in terms of compressions and expansions.

Solution

From Example 1 we have

\[ A = \begin{bmatrix} k_1 & 0 \\ 0 & k_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & k_2 \end{bmatrix}\begin{bmatrix} k_1 & 0 \\ 0 & 1 \end{bmatrix} \]

which shows that multiplication by $A$ has the geometric effect of compressing or expanding by a factor of $k_1$ in the x-direction and then compressing or expanding by a factor of $k_2$ in the y-direction.



EXAMPLE 6 Analyzing the Geometric Effect of a Matrix Operator 



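Express

\[ A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \]

as a product of elementary matrices, and then describe the geometric effect of multiplication by $A$ in terms of shears, compressions, expansions, and reflections.

Solution

$A$ can be reduced to $I$ as follows:

\[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \to \begin{bmatrix} 1 & 2 \\ 0 & -2 \end{bmatrix} \to \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \to \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

The three steps are: add $-3$ times the first row to the second; multiply the second row by $-\tfrac{1}{2}$; add $-2$ times the second row to the first.

The three successive row operations can be performed by multiplying on the left successively by

\[ E_1 = \begin{bmatrix} 1 & 0 \\ -3 & 1 \end{bmatrix}, \quad E_2 = \begin{bmatrix} 1 & 0 \\ 0 & -\tfrac{1}{2} \end{bmatrix}, \quad E_3 = \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix} \]

Inverting these matrices and using (5) yields

\[ A = E_1^{-1} E_2^{-1} E_3^{-1} = \begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \]

Reading from right to left and noting that

\[ \begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \]

it follows that the effect of multiplying by $A$ is equivalent to

1. shearing by a factor of 2 in the x-direction,

2. then expanding by a factor of 2 in the y-direction,

3. then reflecting about the x-axis,

4. then shearing by a factor of 3 in the y-direction.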
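The row-reduction argument behind Theorem 9.2.1 and Example 6 can be sketched in code. The following Python sketch is an illustration only: the function names `elementary_factors` and `product`, and the particular order of row operations it chooses, are assumptions rather than anything prescribed by the text. It reduces an invertible $2 \times 2$ matrix to $I$ and records the inverse of each elementary row operation, so that the recorded matrices multiply back to $A$ as in Formula (5).

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def elementary_factors(a):
    """Return elementary matrices F1, ..., Fk with A = F1 F2 ... Fk.

    Each Fi is the inverse of the i-th row operation used to reduce A to I.
    """
    m = [[Fraction(x) for x in row] for row in a]
    if m[0][0] * m[1][1] - m[0][1] * m[1][0] == 0:
        raise ValueError("matrix is not invertible")
    factors = []
    if m[0][0] == 0:                       # interchange the two rows
        m[0], m[1] = m[1], m[0]
        factors.append([[0, 1], [1, 0]])
    if m[0][0] != 1:                       # multiply row 1 by 1/p
        p = m[0][0]
        m[0] = [x / p for x in m[0]]
        factors.append([[p, 0], [0, 1]])
    if m[1][0] != 0:                       # add -c times row 1 to row 2
        c = m[1][0]
        m[1] = [m[1][j] - c * m[0][j] for j in range(2)]
        factors.append([[1, 0], [c, 1]])
    if m[1][1] != 1:                       # multiply row 2 by 1/q
        q = m[1][1]
        m[1] = [x / q for x in m[1]]
        factors.append([[1, 0], [0, q]])
    if m[0][1] != 0:                       # add -b times row 2 to row 1
        b = m[0][1]
        m[0] = [m[0][j] - b * m[1][j] for j in range(2)]
        factors.append([[1, b], [0, 1]])
    return factors

def product(factors):
    """Multiply the factors from left to right, starting from the identity."""
    result = [[1, 0], [0, 1]]
    for f in factors:
        result = matmul(result, f)
    return result

if __name__ == "__main__":
    for f in elementary_factors([[1, 2], [3, 4]]):
        print(f)
```

For the matrix of Example 6 this particular ordering of row operations happens to coincide with the one used in the solution, so the three recorded factors match the product displayed there; other matrices may be factored in more than one way.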



The proofs for parts of the following theorem are discussed in the exercises. 



THEOREM 9.2.2 



Images of Lines

If $T: R^2 \to R^2$ is multiplication by an invertible matrix, then

(a) The image of a straight line is a straight line.

(b) The image of a straight line through the origin is a straight line through the origin.

(c) The images of parallel straight lines are parallel straight lines.

(d) The image of the line segment joining points $P$ and $Q$ is the line segment joining the images of $P$ and $Q$.

(e) The images of three points lie on a line if and only if the points themselves lie on some line.



Remark It follows from parts (c), (d), and (e) that multiplication by an invertible $2 \times 2$ matrix $A$ maps triangles into triangles and parallelograms into parallelograms.



EXAMPLE 7 Image of a Square 



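The square with vertices $P_1(0, 0)$, $P_2(1, 0)$, $P_3(1, 1)$, and $P_4(0, 1)$ is called the unit square. Sketch the image of the unit square under multiplication by

\[ A = \begin{bmatrix} -1 & 2 \\ 2 & -1 \end{bmatrix} \]

Solution

Since

\[ \begin{bmatrix} -1 & 2 \\ 2 & -1 \end{bmatrix}\begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \qquad \begin{bmatrix} -1 & 2 \\ 2 & -1 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} -1 \\ 2 \end{bmatrix} \]

\[ \begin{bmatrix} -1 & 2 \\ 2 & -1 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \end{bmatrix}, \qquad \begin{bmatrix} -1 & 2 \\ 2 & -1 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

the image of the square is a parallelogram with vertices $(0, 0)$, $(-1, 2)$, $(2, -1)$, and $(1, 1)$ (Figure 9.2.5).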

Figure 9.2.5
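Parts (d) and (e) of Theorem 9.2.2 can be spot-checked on the matrix of Example 7. The sketch below is a hedged illustration in Python; the helpers `apply` and `collinear` are hypothetical names introduced here, and the collinearity test uses the standard zero-signed-area criterion rather than anything stated in the text.

```python
def apply(m, p):
    """Image of the point p = (x, y) under multiplication by the 2x2 matrix m."""
    return (m[0][0] * p[0] + m[0][1] * p[1],
            m[1][0] * p[0] + m[1][1] * p[1])

def collinear(p, q, r):
    """True when p, q, r lie on one line (the triangle pqr has zero signed area)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]) == 0

A = [[-1, 2], [2, -1]]                          # matrix from Example 7
UNIT_SQUARE = [(0, 0), (1, 0), (1, 1), (0, 1)]  # P1, P2, P3, P4

# Images of the vertices, in the same order as UNIT_SQUARE.
image = [apply(A, p) for p in UNIT_SQUARE]

def side(p, q):
    """Displacement vector from p to q."""
    return (q[0] - p[0], q[1] - p[1])

# Opposite sides of the image are equal as vectors, so it is a parallelogram.
is_parallelogram = side(image[0], image[1]) == side(image[3], image[2])

if __name__ == "__main__":
    print(image, is_parallelogram)
```

Since the images of line segments are line segments, joining the four image points in order gives the boundary of the image region.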




EXAMPLE 8 Image of a Line 



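According to Theorem 9.2.2, the invertible matrix

\[ A = \begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix} \]

maps the line $y = 2x + 1$ into another line. Find its equation.

Solution

Let $(x, y)$ be a point on the line $y = 2x + 1$, and let $(x', y')$ be its image under multiplication by $A$. Then

\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix}^{-1}\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & -1 \\ -2 & 3 \end{bmatrix}\begin{bmatrix} x' \\ y' \end{bmatrix} \]

so

\[ x = x' - y', \qquad y = -2x' + 3y' \]

Substituting in $y = 2x + 1$ yields

\[ -2x' + 3y' = 2(x' - y') + 1 \quad\text{or, equivalently,}\quad y' = \tfrac{4}{5}x' + \tfrac{1}{5} \]

Thus $(x', y')$ satisfies

\[ y = \tfrac{4}{5}x + \tfrac{1}{5} \]

which is the equation we want.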
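The method of Example 8 can be packaged as a small routine. The Python sketch below is an illustration under stated assumptions: a line is represented by the hypothetical triple `(A, B, C)` meaning $Ax + By + C = 0$, and the image coefficients are computed with the formulas quoted in the hint to Exercise 20 of this section. The function names are not from the text.

```python
from fractions import Fraction

def image_of_line(matrix, line):
    """Image of the line A*x + B*y + C = 0 under an invertible 2x2 matrix.

    Returns the coefficients (A', B', C') of the image line.
    """
    (a, b), (c, d) = matrix
    A, B, C = line
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return ((d * A - c * B) / det, (-b * A + a * B) / det, Fraction(C))

def slope_intercept(line):
    """Rewrite A*x + B*y + C = 0 as y = m*x + k and return (m, k)."""
    A, B, C = line
    if B == 0:
        raise ValueError("vertical line has no slope-intercept form")
    return (-A / B, -C / B)

# Example 8: y = 2x + 1 is the line 2x - y + 1 = 0.
example_image = image_of_line([[3, 1], [2, 1]], (2, -1, 1))

if __name__ == "__main__":
    print(example_image, slope_intercept(example_image))
```

Exact rational arithmetic is used so that fractional slopes such as the one in Example 8 are not blurred by floating-point rounding.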



Exercise Set 9.2 






1. 



Find the standard matrix for the linear operator $T: R^2 \to R^2$ that maps a point $(x, y)$ into (see the accompanying figure)



(a) its reflection about the line y = — x 



(b) its reflection through the origin 



(c) its orthogonal projection on the x-axis 



(d) its orthogonal projection on the y-axis 




Figure Ex-1

2. For each part of Exercise 1, use the matrix you have obtained to compute $T(2, 1)$. Check your answers geometrically by plotting the points $(2, 1)$ and $T(2, 1)$.



3. 



Find the standard matrix for the linear operator $T: R^3 \to R^3$ that maps a point $(x, y, z)$ into



(a) its reflection through the xy-plane 



(b) its reflection through the xz-plane



(c) its reflection through the yz-plane 



4. For each part of Exercise 3, use the matrix you have obtained to compute $T(1, 1, 1)$. Check your answers geometrically by sketching the vectors $(1, 1, 1)$ and $T(1, 1, 1)$.



5. Find the standard matrix for the linear operator $T: R^3 \to R^3$ that

(a) rotates each vector 90° counterclockwise about the z-axis (looking along the positive z-axis toward the origin)

(b) rotates each vector 90° counterclockwise about the x-axis (looking along the positive x-axis toward the origin)

(c) rotates each vector 90° counterclockwise about the y-axis (looking along the positive y-axis toward the origin)

6. Sketch the image of the rectangle with vertices (0, 0), (1, 0), (1, 2), and (0, 2) under

(a) a reflection about the x-axis 

(b) a reflection about the y-axis 

(c) a compression of factor k = j in the ^-direction 

(d) an expansion of factor $k = 2$ in the x-direction

(e) a shear of factor $k = 3$ in the x-direction

(f) a shear of factor $k = 2$ in the y-direction

7. Sketch the image of the square with vertices (0, 0), (1, 0), (0, 1), and (1, 1) under multiplication by

A =



-3 
1 



8. Find the matrix that rotates a point $(x, y)$ about the origin through

(a) 45° 

(b) 90° 

(c) 180° 

(d) 270° 

(e) -30° 



9. Find the matrix that shears by

(a) a factor of $k = 4$ in the y-direction

(b) a factor of $k = -2$ in the x-direction



10. 



Find the matrix that compresses or expands by 



(a) a factor of ^ in the y-direction 



(b) a factor of 6 in the x-direction 



11. 



In each part, describe the geometric effect of multiplication by the given matrix. 



(a) 



3 
1 



(b) 



1 
-5 



(c) 



1 4 
1 



12. Express the matrix as a product of elementary matrices, and then describe the effect of multiplication by the given matrix in terms of compressions, expansions, reflections, and shears.



(a) 



2 
3 



(b) 



1 4 

2 9 



(c) 



-2 
4 



(d) 



1 -3 
4 6 



13. 



In each part, find a single matrix that performs the indicated succession of operations: 



(a) compresses by a factor of ^ in the x-direction, then expands by a factor of 5 in the ^-direction 

(b) expands by a factor of 5 in the y-direction, then shears by a factor of 2 in the y-direction

(c) reflects about y = x, then rotates through an angle of 180° about the origin 

14. In each part, find a single matrix that performs the indicated succession of operations:

(a) reflects about the y-axis, then expands by a factor of 5 in the x-direction, and then reflects about $y = x$

(b) rotates through 30° about the origin, then shears by a factor of $-2$ in the y-direction, and then expands by a factor of 3 in the y-direction

15. By matrix inversion, show the following:

(a) The inverse transformation for a reflection about y = x is a reflection about y = x. 

(b) The inverse transformation for a compression along an axis is an expansion along that axis. 

(c) The inverse transformation for a reflection about a coordinate axis is a reflection about that axis. 

(d) The inverse transformation for a shear along a coordinate axis is a shear along that axis. 

Find the equation of the image of the line y = _ % | 3 under multiplication by 

b - 2 . 

17. In parts (a) through (e), find the equation of the image of the line $y = 2x$ under

(a) a shear of factor 3 in the x-direction 

(b) a compression of factor — in the j-direction 

(c) a reflection about y = x 

(d) a reflection about the y-axis 

(e) a rotation of 60° about the origin 



18. Find the matrix for a shear in the x-direction that transforms the triangle with vertices (0, 0), (2, 1), and (3, 0) into a right triangle with the right angle at the origin.



19. 



(a) Show that multiplication by

\[ A = \begin{bmatrix} 3 & 1 \\ 6 & 2 \end{bmatrix} \]

maps every point in the plane onto the line $y = 2x$.

(b) It follows from part (a) that the noncollinear points (1, 0), (0, 1), (-1, 0) are mapped on a line. Does this violate part (e) of Theorem 9.2.2?



20. 



Prove part (a) of Theorem 9.2.2.

Hint A line in the plane has an equation of the form $Ax + By + C = 0$, where $A$ and $B$ are not both zero. Use the method of Example 8 to show that the image of this line under multiplication by the invertible matrix

\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

has the equation $A'x + B'y + C = 0$, where

\[ A' = (dA - cB)/(ad - bc) \quad\text{and}\quad B' = (-bA + aB)/(ad - bc) \]

Then show that $A'$ and $B'$ are not both zero to conclude that the image is a line.



21. 



Use the hint in Exercise 20 to prove parts (b) and (c) of Theorem 9.2.2.



22. 



In each part, find the standard matrix for the linear operator $T: R^3 \to R^3$ described by the accompanying figure.

[Figure: panels (a), (b), (c)]



23. In $R^3$ the shear in the xy-direction with factor $k$ is the linear transformation that moves each point $(x, y, z)$ parallel to the xy-plane to the new position $(x + kz, y + kz, z)$. (See the accompanying figure.)

(a) Find the standard matrix for the shear in the xy-direction with factor $k$.

(b) How would you define the shear in the xz-direction with factor $k$ and the shear in the yz-direction with factor $k$? Find the standard matrices for these linear transformations.






24. In each part, find as many linearly independent eigenvectors as you can by inspection (by visualizing the geometric effect of the transformation on $R^2$). For each of your eigenvectors find the corresponding eigenvalue by inspection; then check your results by computing the eigenvalues and bases for the eigenspaces from the standard matrix for the transformation.



(a) reflection about the x-axis 



(b) reflection about the y-axis 



(c) reflection about y = x 



(d) shear in the x-direction with factor k 



(e) shear in the y-direction with factor k 



(f) rotation through the angle $\theta$
9.3

LEAST SQUARES FITTING TO DATA

In this section we shall use results about orthogonal projections in inner product spaces to obtain a technique for fitting a line or other polynomial curve to a set of experimentally determined points in the plane.



Fitting a Curve to Data 

A common problem in experimental work is to obtain a mathematical relationship $y = f(x)$ between two variables $x$ and $y$ by "fitting" a curve to points in the plane corresponding to various experimentally determined values of $x$ and $y$, say

\[ (x_1, y_1),\ (x_2, y_2),\ \ldots,\ (x_n, y_n) \]

On the basis of theoretical considerations or simply by the pattern of the points, one decides on the general form of the curve $y = f(x)$ to be fitted. Some possibilities are (Figure 9.3.1)

(a) A straight line: $y = a + bx$

(b) A quadratic polynomial: $y = a + bx + cx^2$

(c) A cubic polynomial: $y = a + bx + cx^2 + dx^3$




Figure 9.3.1 (a) $y = a + bx$ (b) $y = a + bx + cx^2$ (c) $y = a + bx + cx^2 + dx^3$

Because the points are obtained experimentally, there is usually some measurement "error" in the data, making it impossible 
to find a curve of the desired form that passes through all the points. Thus, the idea is to choose the curve (by determining its 
coefficients) that "best" fits the data. We begin with the simplest and most common case: fitting a straight line to the data 
points. 

Least Squares Fit of a Straight Line 

Suppose we want to fit a straight line $y = a + bx$ to the experimentally determined points

\[ (x_1, y_1),\ (x_2, y_2),\ \ldots,\ (x_n, y_n) \]

If the data points were collinear, the line would pass through all $n$ points, and so the unknown coefficients $a$ and $b$ would satisfy

\[ \begin{aligned} y_1 &= a + bx_1 \\ y_2 &= a + bx_2 \\ &\ \vdots \\ y_n &= a + bx_n \end{aligned} \]

We can write this system in matrix form as

\[ \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \]

or, more compactly, as

\[ M\mathbf{v} = \mathbf{y} \tag{1} \]

where

\[ M = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}, \quad \mathbf{v} = \begin{bmatrix} a \\ b \end{bmatrix}, \quad \mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \tag{2} \]

If the data points are not collinear, then it is impossible to find coefficients $a$ and $b$ that satisfy system (1) exactly; that is, the system is inconsistent. In this case we shall look for a least squares solution

\[ \mathbf{v} = \mathbf{v}^* = \begin{bmatrix} a^* \\ b^* \end{bmatrix} \]

We call a line $y = a^* + b^*x$ whose coefficients come from a least squares solution a regression line or a least squares straight line fit to the data. To explain this terminology, recall that a least squares solution of (1) minimizes

\[ \|\mathbf{y} - M\mathbf{v}\| \tag{3} \]

If we express the square of (3) in terms of components, we obtain

\[ \|\mathbf{y} - M\mathbf{v}\|^2 = (y_1 - a - bx_1)^2 + (y_2 - a - bx_2)^2 + \cdots + (y_n - a - bx_n)^2 \tag{4} \]

If we now let

\[ d_1 = |y_1 - a - bx_1|, \quad d_2 = |y_2 - a - bx_2|, \quad \ldots, \quad d_n = |y_n - a - bx_n| \]

then (4) can be written as

\[ \|\mathbf{y} - M\mathbf{v}\|^2 = d_1^2 + d_2^2 + \cdots + d_n^2 \tag{5} \]

As illustrated in Figure 9.3.2, $d_i$ can be interpreted as the vertical distance between the line $y = a + bx$ and the data point $(x_i, y_i)$. This distance is a measure of the "error" at the point $(x_i, y_i)$ resulting from the inexact fit of $y = a + bx$ to the data points. The assumption is that the $x_i$ are known exactly and that all the error is in the measurement of the $y_i$. We model the error in the $y_i$ as an additive error; that is, the measured $y_i$ is equal to $a + bx_i + d_i$ for some unknown error $d_i$. Since (3) and (5) are minimized by the same vector $\mathbf{v}^*$, the least squares straight line fit minimizes the sum of the squares of the estimated errors $d_i$, hence the name least squares straight line fit.




Figure 9.3.2 $d_i$ measures the vertical error in the least squares straight line.



Normal Equations 

Recall from Theorem 6.4.2 that the least squares solutions of (1) can be obtained by solving the associated normal system

\[ M^T M\mathbf{v} = M^T\mathbf{y} \]

the equations of which are called the normal equations.

In the exercises it will be shown that the column vectors of $M$ are linearly independent if and only if the $n$ data points do not lie on a vertical line in the xy-plane. In this case it follows from Theorem 6.4.4 that the least squares solution is unique and is given by

\[ \mathbf{v}^* = (M^T M)^{-1} M^T\mathbf{y} \]



In summary, we have the following theorem. 



THEOREM 9.3.1 



Least Squares Solution 

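Let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be a set of two or more data points, not all lying on a vertical line, and let

\[ M = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} \quad\text{and}\quad \mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \]

Then there is a unique least squares straight line fit

\[ y = a^* + b^*x \]

to the data points. Moreover,

\[ \mathbf{v}^* = \begin{bmatrix} a^* \\ b^* \end{bmatrix} \]

is given by the formula

\[ \mathbf{v}^* = (M^T M)^{-1} M^T\mathbf{y} \tag{6} \]

which expresses the fact that $\mathbf{v} = \mathbf{v}^*$ is the unique solution of the normal equations

\[ M^T M\mathbf{v} = M^T\mathbf{y} \tag{7} \]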
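For the straight-line case it can help to see the normal equations written out in terms of sums. The following worked expansion is not in the original text; it is obtained by multiplying out $M^T M$ and $M^T\mathbf{y}$ for the matrices of the theorem above, with all sums running over $i = 1, \ldots, n$.

```latex
\[
M^T M = \begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix},
\qquad
M^T \mathbf{y} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \end{bmatrix}
\]
so the normal equations $M^T M \mathbf{v} = M^T \mathbf{y}$ read
\[
\begin{aligned}
n\,a + \Big(\sum x_i\Big) b &= \sum y_i \\
\Big(\sum x_i\Big) a + \Big(\sum x_i^2\Big) b &= \sum x_i y_i
\end{aligned}
\]
and, when $n \sum x_i^2 - \big(\sum x_i\big)^2 \neq 0$,
\[
a^* = \frac{\sum x_i^2 \sum y_i - \sum x_i \sum x_i y_i}{n \sum x_i^2 - \big(\sum x_i\big)^2},
\qquad
b^* = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \big(\sum x_i\big)^2}
\]
```

The denominator is the determinant of $M^T M$; it vanishes exactly when all the $x_i$ are equal, which is the vertical-line case excluded by the theorem.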



EXAMPLE 1 Least Squares Line: Using Formula 6 



Find the least squares straight line fit to the four points (0, 1), (1, 3), (2, 4), and (3, 4). (See Figure 9.3.3.) 















Figure 9.3.3



Solution 



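We have

\[ M = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}, \quad M^T M = \begin{bmatrix} 4 & 6 \\ 6 & 14 \end{bmatrix}, \quad\text{and}\quad (M^T M)^{-1} = \frac{1}{10}\begin{bmatrix} 7 & -3 \\ -3 & 2 \end{bmatrix} \]

\[ \mathbf{v}^* = (M^T M)^{-1} M^T\mathbf{y} = \frac{1}{10}\begin{bmatrix} 7 & -3 \\ -3 & 2 \end{bmatrix}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \end{bmatrix}\begin{bmatrix} 1 \\ 3 \\ 4 \\ 4 \end{bmatrix} = \begin{bmatrix} 1.5 \\ 1 \end{bmatrix} \]

so the desired line is $y = 1.5 + x$.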
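The computation in Example 1 can be mirrored step by step in code. This is a minimal Python sketch using only the standard library; the function names (`transpose`, `matmul`, `inverse_2x2`, `fit_line`) are illustrative assumptions. It forms $M^T M$ and $M^T\mathbf{y}$ explicitly and then applies Formula (6) with exact rational arithmetic.

```python
from fractions import Fraction

def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Matrix product of a (p x q) and b (q x r), both lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("M^T M is singular: the data lie on a vertical line")
    return [[d / det, -b / det], [-c / det, a / det]]

def fit_line(points):
    """Return (a, b) for the least squares line y = a + b*x, via Formula (6)."""
    M = [[Fraction(1), Fraction(x)] for x, _ in points]
    y = [[Fraction(v)] for _, v in points]
    Mt = transpose(M)
    MtM = matmul(Mt, M)          # 2x2 matrix M^T M
    Mty = matmul(Mt, y)          # 2x1 matrix M^T y
    v = matmul(inverse_2x2(MtM), Mty)
    return v[0][0], v[1][0]

if __name__ == "__main__":
    print(fit_line([(0, 1), (1, 3), (2, 4), (3, 4)]))
```

Explicitly inverting $M^T M$ follows the text; for larger or poorly conditioned problems, numerical libraries usually solve the least squares problem by other means, which is outside the scope of this sketch.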



EXAMPLE 2 Spring Constant 



Hooke's law in physics states that the length $x$ of a uniform spring is a linear function of the force $y$ applied to it. If we write $y = a + bx$, then the coefficient $b$ is called the spring constant. Suppose a particular unstretched spring has a measured length of 6.1 inches (i.e., $x = 6.1$ when $y = 0$). Forces of 2 pounds, 4 pounds, and 6 pounds are then applied to the spring, and the corresponding lengths are found to be 7.6 inches, 8.7 inches, and 10.4 inches (see Figure 9.3.4). Find the spring constant of this spring.



Figure 9.3.4 Length $x_i$ (inches) versus force $y_i$ (pounds): (6.1, 0), (7.6, 2), (8.7, 4), (10.4, 6).



Solution 



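We have

\[ M = \begin{bmatrix} 1 & 6.1 \\ 1 & 7.6 \\ 1 & 8.7 \\ 1 & 10.4 \end{bmatrix}, \quad \mathbf{y} = \begin{bmatrix} 0 \\ 2 \\ 4 \\ 6 \end{bmatrix} \]

and

\[ \mathbf{v}^* = \begin{bmatrix} a^* \\ b^* \end{bmatrix} = (M^T M)^{-1} M^T\mathbf{y} \approx \begin{bmatrix} -8.6 \\ 1.4 \end{bmatrix} \]

where the numerical values have been rounded to one decimal place. Thus the estimated value of the spring constant is $b^* \approx 1.4$ pounds/inch.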
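The numbers in Example 2 can be reproduced without forming any matrices. The Python sketch below is an illustration, not the text's own method of presentation: it uses the closed-form solution of the $2 \times 2$ normal equations written in terms of sums (obtained by multiplying out $M^T M$ and $M^T\mathbf{y}$), and the names `fit_line_sums`, `lengths`, and `forces` are assumptions introduced here.

```python
def fit_line_sums(xs, ys):
    """Least squares line y = a + b*x from the expanded normal equations.

    n*a + (sum x)*b = sum y  and  (sum x)*a + (sum x^2)*b = sum x*y.
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx            # determinant of M^T M
    if det == 0:
        raise ValueError("all x-values coincide; the fit is not unique")
    a = (sxx * sy - sx * sxy) / det
    b = (n * sxy - sx * sy) / det
    return a, b

lengths = [6.1, 7.6, 8.7, 10.4]   # x: spring length in inches
forces = [0.0, 2.0, 4.0, 6.0]     # y: applied force in pounds

a_star, b_star = fit_line_sums(lengths, forces)

if __name__ == "__main__":
    # Rounded to one decimal place, as in the example.
    print(round(a_star, 1), round(b_star, 1))
```

Floating-point arithmetic is adequate here because the example itself reports values rounded to one decimal place.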

Least Squares Fit of a Polynomial 

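The technique described for fitting a straight line to data points generalizes easily to fitting a polynomial of any specified degree to data points. Let us attempt to fit a polynomial of fixed degree $m$

\[ y = a_0 + a_1 x + \cdots + a_m x^m \tag{8} \]

to $n$ points

\[ (x_1, y_1),\ (x_2, y_2),\ \ldots,\ (x_n, y_n) \]

Substituting these $n$ values of $x$ and $y$ into (8) yields the $n$ equations

\[ \begin{aligned} y_1 &= a_0 + a_1 x_1 + \cdots + a_m x_1^m \\ y_2 &= a_0 + a_1 x_2 + \cdots + a_m x_2^m \\ &\ \vdots \\ y_n &= a_0 + a_1 x_n + \cdots + a_m x_n^m \end{aligned} \]

or, in matrix form,

\[ M\mathbf{v} = \mathbf{y} \tag{9} \]

where

\[ M = \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^m \\ 1 & x_2 & x_2^2 & \cdots & x_2^m \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^m \end{bmatrix}, \quad \mathbf{v} = \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_m \end{bmatrix}, \quad \mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \tag{10} \]

As before, the solutions of the normal equations

\[ M^T M\mathbf{v} = M^T\mathbf{y} \]

determine the coefficients of the polynomial. The vector $\mathbf{v}$ minimizes

\[ \|\mathbf{y} - M\mathbf{v}\| \]

Conditions that guarantee the invertibility of $M^T M$ are discussed in the exercises. If $M^T M$ is invertible, then the normal equations have a unique solution $\mathbf{v} = \mathbf{v}^*$, which is given by

\[ \mathbf{v}^* = (M^T M)^{-1} M^T\mathbf{y} \tag{11} \]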



Space Exploration 




[Figure: Temperature of Venusian Atmosphere. Magellan orbit 3213; date: 5 October 1991; latitude: 67 N; LTST: 22:05. Temperature $T$ (K) plotted against altitude $h$ (km). Source: NASA]

On October 5, 1991 the Magellan spacecraft entered the atmosphere of Venus and transmitted the temperature $T$ in kelvins (K) versus the altitude $h$ in kilometers (km) until its signal was lost at an altitude of about 34 km. Discounting the initial erratic signal, the data strongly suggested a linear relationship, so a least squares straight line fit was used on the linear part of the data to obtain the equation

\[ T = 737.5 - 8.125h \]

By setting $h = 0$ in this equation, the surface temperature of Venus was estimated at $T \approx 737.5$ K.



EXAMPLE 3 Fitting a Quadratic Curve to Data 



According to Newton's second law of motion, a body near the earth's surface falls vertically downward according to the equation

\[ s = s_0 + v_0 t + \tfrac{1}{2} g t^2 \tag{12} \]

where

$s$ = vertical displacement downward relative to some fixed point

$s_0$ = initial displacement at time $t = 0$

$v_0$ = initial velocity at time $t = 0$

$g$ = acceleration of gravity at the earth's surface

Suppose that a laboratory experiment is performed to evaluate $g$ using this equation. A weight is released with unknown initial displacement and velocity, and at certain times the distances fallen relative to some fixed reference point are measured. In particular, suppose it is found that at times $t = .1, .2, .3, .4,$ and $.5$ seconds, the weight has fallen $s = -0.18, 0.31, 1.03, 2.48,$ and $3.73$ feet, respectively, from the reference point. Find an approximate value of $g$ using these data.

Solution

The mathematical problem is to fit a quadratic curve

\[ s = a_0 + a_1 t + a_2 t^2 \tag{13} \]

to the five data points:

\[ (.1, -0.18),\ (.2, 0.31),\ (.3, 1.03),\ (.4, 2.48),\ (.5, 3.73) \]

With the appropriate adjustments in notation, the matrices $M$ and $\mathbf{y}$ in (10) are

\[ M = \begin{bmatrix} 1 & t_1 & t_1^2 \\ 1 & t_2 & t_2^2 \\ 1 & t_3 & t_3^2 \\ 1 & t_4 & t_4^2 \\ 1 & t_5 & t_5^2 \end{bmatrix} = \begin{bmatrix} 1 & .1 & .01 \\ 1 & .2 & .04 \\ 1 & .3 & .09 \\ 1 & .4 & .16 \\ 1 & .5 & .25 \end{bmatrix}, \quad \mathbf{y} = \begin{bmatrix} s_1 \\ s_2 \\ s_3 \\ s_4 \\ s_5 \end{bmatrix} = \begin{bmatrix} -0.18 \\ 0.31 \\ 1.03 \\ 2.48 \\ 3.73 \end{bmatrix} \]

Thus, from (11),

\[ \mathbf{v}^* = \begin{bmatrix} a_0^* \\ a_1^* \\ a_2^* \end{bmatrix} = (M^T M)^{-1} M^T\mathbf{y} \approx \begin{bmatrix} -0.40 \\ 0.35 \\ 16.1 \end{bmatrix} \]

From (12) and (13), we have $a_2 = \tfrac{1}{2} g$, so the estimated value of $g$ is

\[ g = 2a_2^* = 2(16.1) = 32.2 \text{ feet/second}^2 \]

If desired, we can also estimate the initial displacement and initial velocity of the weight:

\[ s_0^* = a_0^* = -0.40 \text{ feet} \]

\[ v_0^* = a_1^* = 0.35 \text{ feet/second} \]

In Figure 9.3.5 we have plotted the five data points and the approximating polynomial.

Figure 9.3.5 (horizontal axis: time $t$ in seconds)
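The polynomial fit of Formula (11) can be sketched for any degree. The Python code below is a hedged illustration, not the text's procedure: instead of inverting $M^T M$ it solves the normal equations by Gauss-Jordan elimination, and the function name `polyfit_normal_equations` is an assumption. Decimal inputs are converted through their string form so that values such as 0.1 are treated as exact fractions.

```python
from fractions import Fraction

def polyfit_normal_equations(points, degree):
    """Return [a0, a1, ..., am] solving (M^T M) v = M^T y exactly.

    points is a list of (x, y) pairs; degree is m in y = a0 + a1*x + ... + am*x^m.
    """
    size = degree + 1
    # Rows of M are (1, x, x^2, ..., x^m).
    M = [[Fraction(str(x)) ** j for j in range(size)] for x, _ in points]
    y = [Fraction(str(v)) for _, v in points]
    # Augmented matrix [M^T M | M^T y].
    aug = [[sum(row[i] * row[j] for row in M) for j in range(size)]
           + [sum(row[i] * yi for row, yi in zip(M, y))]
           for i in range(size)]
    # Gauss-Jordan elimination; a missing pivot means M^T M is not invertible.
    for col in range(size):
        pivot = next((r for r in range(col, size) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("M^T M is not invertible for these data")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * z for x, z in zip(aug[r], aug[col])]
    return [aug[i][size] for i in range(size)]

# Data of Example 3: (time in seconds, distance in feet).
falling_weight = [(0.1, -0.18), (0.2, 0.31), (0.3, 1.03), (0.4, 2.48), (0.5, 3.73)]
a0, a1, a2 = polyfit_normal_equations(falling_weight, 2)

if __name__ == "__main__":
    print(float(a0), float(a1), float(a2), 2 * float(a2))
```

The unrounded coefficients agree with the rounded values quoted in Example 3; doubling the unrounded $a_2$ gives an estimate of $g$ slightly below the value obtained in the text by doubling the rounded coefficient 16.1.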
Exercise Set 9.3

1. Find the least squares straight line fit to the three points (0, 0), (1, 2), and (2, 7).

2. Find the least squares straight line fit to the four points (0, 1), (2, 0), (3, 1), and (3, 2).

3. Find the quadratic polynomial that best fits the four points (2, 0), (3, -10), (5, -48), and (6, -76).

4. Find the cubic polynomial that best fits the five points (-1, -14), (0, -5), (1, -4), (2, 1), and (3, 22).

5. Show that the matrix $M$ in Equation (2) has linearly independent columns if and only if at least two of the numbers $x_1, x_2, \ldots, x_n$ are distinct.

6. Show that the columns of the $n \times (m + 1)$ matrix $M$ in Equation (10) are linearly independent if $n > m$ and at least $m + 1$ of the numbers $x_1, x_2, \ldots, x_n$ are distinct.

Hint A nonzero polynomial of degree $m$ has at most $m$ distinct roots.

7. Let $M$ be the matrix in Equation (10). Using Exercise 6, show that a sufficient condition for the matrix $M^T M$ to be invertible is that $n > m$ and that at least $m + 1$ of the numbers $x_1, x_2, \ldots, x_n$ are distinct.

8. The owner of a rapidly expanding business finds that for the first five months of the year the sales (in thousands) are $4.0, $4.4, $5.2, $6.4, and $8.0. The owner plots these figures on a graph and conjectures that for the rest of the year, the sales curve can be approximated by a quadratic polynomial. Find the least squares quadratic polynomial fit to the sales curve, and use it to project the sales for the twelfth month of the year.

9. A corporation obtains the following data relating the number of sales representatives on its staff to annual sales:

Number of Sales Representatives | 5 | 10 | 15 | 20 | 25 | 30

Annual Sales (millions) | 3.4 | 4.3 | 5.2 | 6.1 | 7.2 | 8.3

Explain how you might use least squares methods to estimate the annual sales with 45 representatives, and discuss the assumptions that you are making. (You need not perform the actual computations.)

10. Find a curve of the form $y = a + (b/x)$ that best fits the data points (1, 7), (3, 3), (6, 1) by making the substitution $X = 1/x$. Draw the curve and plot the data points in the same coordinate system.
9.4

APPROXIMATION PROBLEMS; FOURIER SERIES

In this section we shall use results about orthogonal projections in inner product spaces to solve problems that involve approximating a given function by simpler functions. Such problems arise in a variety of engineering and scientific applications.



Best Approximations 

All of the problems that we will study in this section will be special cases of the following general problem. 

Approximation Problem Given a function $f$ that is continuous on an interval $[a, b]$, find the "best possible approximation" to $f$ using only functions from a specified subspace $W$ of $C[a, b]$.

Here are some examples of such problems:

(a) Find the best possible approximation to $e^x$ over $[0, 1]$ by a polynomial of the form $a_0 + a_1 x + a_2 x^2$.

(b) Find the best possible approximation to $\sin \pi x$ over $[-1, 1]$ by a function of the form $a_0 + a_1 e^x + a_2 e^{2x} + a_3 e^{3x}$.

(c) Find the best possible approximation to $x$ over $[0, 2\pi]$ by a function of the form $a_0 + a_1 \sin x + a_2 \sin 2x + b_1 \cos x + b_2 \cos 2x$.

In the first example $W$ is the subspace of $C[0, 1]$ spanned by $1$, $x$, and $x^2$; in the second example $W$ is the subspace of $C[-1, 1]$ spanned by $1$, $e^x$, $e^{2x}$, and $e^{3x}$; and in the third example $W$ is the subspace of $C[0, 2\pi]$ spanned by $1$, $\sin x$, $\sin 2x$, $\cos x$, and $\cos 2x$.

Measurements of Error 

To solve approximation problems of the preceding types, we must make the phrase "best approximation over [a, b]" 
mathematically precise; to do this, we need a precise way of measuring the error that results when one continuous function is 
approximated by another over [a, b]. If we were concerned only with approximating f(x) at a single point \(x_0\), then the error 
at \(x_0\) by an approximation g(x) would be simply 

\[ \text{error} = |f(x_0) - g(x_0)| \]

sometimes called the deviation between f and g at \(x_0\) (Figure 9.4.1). However, we are concerned with approximation over the 
entire interval [a, b], not at a single point. Consequently, in one part of the interval an approximation \(g_1\) to f may have 
smaller deviations from f than an approximation \(g_2\) to f, and in another part of the interval it might be the other way around. 
How do we decide which is the better overall approximation? What we need is some way of measuring the overall error in an 
approximation g(x). One possible measure of overall error is obtained by integrating the deviation \(|f(x) - g(x)|\) over the 
entire interval [a, b]; that is, 

\[ \text{error} = \int_a^b |f(x) - g(x)|\,dx \tag{1} \]




Figure 9.4.1 The deviation between f and g at \(x_0\). 



Geometrically, (1) is the area between the graphs of f(x) and g(x) over the interval [a, b] (Figure 9.4.2); the greater the area, 
the greater the overall error. 




Figure 9.4.2 The area between the graphs of f and g over [a, b] measures the error in approximating f by g over [a, b]. 



Although (1) is natural and appealing geometrically, most mathematicians and scientists generally favor the following 
alternative measure of error, called the mean square error. 

\[ \text{mean square error} = \int_a^b [f(x) - g(x)]^2\,dx \]



Mean square error emphasizes the effect of larger errors because of the squaring and has the added advantage that it allows 
us to bring to bear the theory of inner product spaces. To see how, suppose that f is a continuous function on [a, b] that we 
want to approximate by a function g from a subspace W of C[a, b], and suppose that C[a, b] is given the inner product 

\[ \langle f, g \rangle = \int_a^b f(x)g(x)\,dx \]

It follows that 

\[ \|f - g\|^2 = \langle f - g,\, f - g \rangle = \int_a^b [f(x) - g(x)]^2\,dx = \text{mean square error} \]

so minimizing the mean square error is the same as minimizing \(\|f - g\|^2\). Thus the approximation problem posed informally 
at the beginning of this section can be restated more precisely as follows: 

Least Squares Approximation 

Least Squares Approximation Problem Let f be a function that is continuous on an interval [a, b], let C[a, b] have the 
inner product 

\[ \langle f, g \rangle = \int_a^b f(x)g(x)\,dx \]

and let W be a finite-dimensional subspace of C[a, b]. Find a function g in W that minimizes 

\[ \|f - g\|^2 = \int_a^b [f(x) - g(x)]^2\,dx \]

Since \(\|f - g\|^2\) and \(\|f - g\|\) are minimized by the same function g, the preceding problem is equivalent to looking for a 
function g in W that is closest to f. But we know from Theorem 6.4.1 that \(g = \operatorname{proj}_W f\) is such a function (Figure 9.4.3). Thus 
we have the following result. 

Figure 9.4.3 f = function in C[a, b] to be approximated; \(g = \operatorname{proj}_W f\) = least squares approximation to f from W; 
W = vector space of approximating functions. 



Solution of the Least Squares Approximation Problem If f is a continuous function on [a, b], and W is a 
finite-dimensional subspace of C[a, b], then the function g in W that minimizes the mean square error 

\[ \int_a^b [f(x) - g(x)]^2\,dx \]

is \(g = \operatorname{proj}_W f\), where the orthogonal projection is relative to the inner product 

\[ \langle f, g \rangle = \int_a^b f(x)g(x)\,dx \]

The function \(g = \operatorname{proj}_W f\) is called the least squares approximation to f from W. 

Fourier Series 

A function of the form 

\[ t(x) = c_0 + c_1 \cos x + c_2 \cos 2x + \cdots + c_n \cos nx + d_1 \sin x + d_2 \sin 2x + \cdots + d_n \sin nx \tag{2} \]

is called a trigonometric polynomial; if \(c_n\) and \(d_n\) are not both zero, then t(x) is said to have order n. For example, 

\[ t(x) = 2 + \cos x - 3\cos 2x + 7\sin 4x \]

is a trigonometric polynomial with 

\[ c_0 = 2,\ c_1 = 1,\ c_2 = -3,\ c_3 = 0,\ c_4 = 0,\ d_1 = 0,\ d_2 = 0,\ d_3 = 0,\ d_4 = 7 \]

The order of t(x) is 4. 

It is evident from (2) that the trigonometric polynomials of order n or less are the various possible linear combinations of 

\[ 1,\ \cos x,\ \cos 2x,\ \ldots,\ \cos nx,\ \sin x,\ \sin 2x,\ \ldots,\ \sin nx \tag{3} \]

It can be shown that these 2n + 1 functions are linearly independent and that consequently, for any interval [a, b], they form 
a basis for a (2n + 1)-dimensional subspace of C[a, b]. 

Let us now consider the problem of finding the least squares approximation of a continuous function f(x) over the interval 
\([0, 2\pi]\) by a trigonometric polynomial of order n or less. Let W denote the subspace of \(C[0, 2\pi]\) spanned by the functions 
in (3). As noted above, the least squares approximation to f from W is the 
orthogonal projection of f on W. To find this orthogonal projection, we must find an orthonormal basis \(g_0, g_1, \ldots, g_{2n}\) for W, 
after which we can compute the orthogonal projection on W from the formula 

\[ \operatorname{proj}_W f = \langle f, g_0 \rangle g_0 + \langle f, g_1 \rangle g_1 + \cdots + \langle f, g_{2n} \rangle g_{2n} \tag{4} \]

[see Theorem 6.3.5]. An orthonormal basis for W can be obtained by applying the Gram-Schmidt process to the basis (3), 
using the inner product 

\[ \langle u, v \rangle = \int_0^{2\pi} u(x)v(x)\,dx \]

This yields (Exercise 6) the orthonormal basis 



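\[ g_0 = \frac{1}{\sqrt{2\pi}},\quad g_1 = \frac{1}{\sqrt{\pi}}\cos x,\ \ldots,\ g_n = \frac{1}{\sqrt{\pi}}\cos nx,\quad g_{n+1} = \frac{1}{\sqrt{\pi}}\sin x,\ \ldots,\ g_{2n} = \frac{1}{\sqrt{\pi}}\sin nx \tag{5} \]

If we introduce the notation 

\[ a_0 = \frac{2}{\sqrt{2\pi}}\langle f, g_0 \rangle,\quad a_1 = \frac{1}{\sqrt{\pi}}\langle f, g_1 \rangle,\ \ldots,\ a_n = \frac{1}{\sqrt{\pi}}\langle f, g_n \rangle,\quad b_1 = \frac{1}{\sqrt{\pi}}\langle f, g_{n+1} \rangle,\ \ldots,\ b_n = \frac{1}{\sqrt{\pi}}\langle f, g_{2n} \rangle \tag{6} \]

then on substituting (5) in (4), we obtain 

\[ \operatorname{proj}_W f = \frac{a_0}{2} + [a_1 \cos x + \cdots + a_n \cos nx] + [b_1 \sin x + \cdots + b_n \sin nx] \tag{7} \]

where 

\[
\begin{aligned}
a_0 &= \frac{2}{\sqrt{2\pi}}\langle f, g_0 \rangle = \frac{2}{\sqrt{2\pi}} \int_0^{2\pi} f(x)\frac{1}{\sqrt{2\pi}}\,dx = \frac{1}{\pi}\int_0^{2\pi} f(x)\,dx \\
a_1 &= \frac{1}{\sqrt{\pi}}\langle f, g_1 \rangle = \frac{1}{\sqrt{\pi}} \int_0^{2\pi} f(x)\frac{1}{\sqrt{\pi}}\cos x\,dx = \frac{1}{\pi}\int_0^{2\pi} f(x)\cos x\,dx \\
&\ \,\vdots \\
a_n &= \frac{1}{\sqrt{\pi}}\langle f, g_n \rangle = \frac{1}{\sqrt{\pi}} \int_0^{2\pi} f(x)\frac{1}{\sqrt{\pi}}\cos nx\,dx = \frac{1}{\pi}\int_0^{2\pi} f(x)\cos nx\,dx \\
b_1 &= \frac{1}{\sqrt{\pi}}\langle f, g_{n+1} \rangle = \frac{1}{\sqrt{\pi}} \int_0^{2\pi} f(x)\frac{1}{\sqrt{\pi}}\sin x\,dx = \frac{1}{\pi}\int_0^{2\pi} f(x)\sin x\,dx \\
&\ \,\vdots \\
b_n &= \frac{1}{\sqrt{\pi}}\langle f, g_{2n} \rangle = \frac{1}{\sqrt{\pi}} \int_0^{2\pi} f(x)\frac{1}{\sqrt{\pi}}\sin nx\,dx = \frac{1}{\pi}\int_0^{2\pi} f(x)\sin nx\,dx
\end{aligned}
\]

In short, 

\[ a_k = \frac{1}{\pi}\int_0^{2\pi} f(x)\cos kx\,dx,\qquad b_k = \frac{1}{\pi}\int_0^{2\pi} f(x)\sin kx\,dx \tag{8} \]

The numbers \(a_0, a_1, \ldots, a_n, b_1, \ldots, b_n\) are called the Fourier coefficients of f. 

EXAMPLE 1 Least Squares Approximations 

Find the least squares approximation of f(x) = x on \([0, 2\pi]\) by 

(a) a trigonometric polynomial of order 2 or less; 

(b) a trigonometric polynomial of order n or less. 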



Solution (a) 

\[ a_0 = \frac{1}{\pi}\int_0^{2\pi} f(x)\,dx = \frac{1}{\pi}\int_0^{2\pi} x\,dx = 2\pi \tag{9a} \]

For k = 1, 2, ..., integration by parts yields (verify) 

\[ a_k = \frac{1}{\pi}\int_0^{2\pi} f(x)\cos kx\,dx = \frac{1}{\pi}\int_0^{2\pi} x\cos kx\,dx = 0 \tag{9b} \]

\[ b_k = \frac{1}{\pi}\int_0^{2\pi} f(x)\sin kx\,dx = \frac{1}{\pi}\int_0^{2\pi} x\sin kx\,dx = -\frac{2}{k} \tag{9c} \]

Thus the least squares approximation to x on \([0, 2\pi]\) by a trigonometric polynomial of order 2 or less is 

\[ x \approx \frac{a_0}{2} + a_1 \cos x + a_2 \cos 2x + b_1 \sin x + b_2 \sin 2x \]

or, from (9a), (9b), and (9c), 

\[ x \approx \pi - 2\sin x - \sin 2x \]

Solution (b) 

The least squares approximation to x on \([0, 2\pi]\) by a trigonometric polynomial of order n or less is 

\[ x \approx \frac{a_0}{2} + [a_1 \cos x + \cdots + a_n \cos nx] + [b_1 \sin x + \cdots + b_n \sin nx] \]

or, from (9a), (9b), and (9c), 

\[ x \approx \pi - 2\left(\sin x + \frac{\sin 2x}{2} + \frac{\sin 3x}{3} + \cdots + \frac{\sin nx}{n}\right) \]

The graphs of y = x and some of these approximations are shown in Figure 9.4.4. 

Figure 9.4.4 Graphs of y = x and several of its least squares trigonometric approximations over \([0, 2\pi]\). 
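The Fourier coefficients in Example 1 can also be checked numerically. The following is a minimal sketch, not part of the original text: it approximates the integrals in Formula (8) with composite Simpson's rule, and the helper names `integrate`, `fourier_coefficients`, and `least_squares_trig` are illustrative choices rather than anything defined in this section.

```python
import math


def integrate(func, a, b, steps=2000):
    """Approximate the integral of func over [a, b] by composite Simpson's rule."""
    h = (b - a) / steps  # steps is assumed to be even
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def fourier_coefficients(f, n):
    """Return lists a[0..n], b[0..n] from Formula (8) on [0, 2*pi]; b[0] is 0."""
    two_pi = 2 * math.pi
    a = [integrate(lambda x: f(x) * math.cos(k * x), 0, two_pi) / math.pi
         for k in range(n + 1)]
    b = [integrate(lambda x: f(x) * math.sin(k * x), 0, two_pi) / math.pi
         for k in range(n + 1)]
    return a, b


def least_squares_trig(f, n):
    """Return the order-n least squares approximation from Formula (7)."""
    a, b = fourier_coefficients(f, n)

    def approximation(x):
        return a[0] / 2 + sum(a[k] * math.cos(k * x) + b[k] * math.sin(k * x)
                              for k in range(1, n + 1))

    return approximation


# Coefficients for f(x) = x; compare with (9a), (9b), and (9c).
a, b = fourier_coefficients(lambda x: x, 3)
print([round(value, 6) for value in a])
print([round(value, 6) for value in b])
```

Up to quadrature error, the computed values should agree with \(a_0 = 2\pi\), \(a_k = 0\), and \(b_k = -2/k\).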



It is natural to expect that the mean square error will diminish as the number of terms in the least squares approximation 

\[ f(x) \approx \frac{a_0}{2} + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx) \]

increases. It can be proved that for functions f in \(C[0, 2\pi]\), the mean square error approaches zero as \(n \to +\infty\); this is 
denoted by writing 

\[ f(x) = \frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx) \]

The right side of this equation is called the Fourier series for f over the interval \([0, 2\pi]\). Such series are of major importance 
in engineering, science, and mathematics. 
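The decrease of the mean square error can be observed numerically for f(x) = x, using the approximation found in Example 1. This is an illustrative sketch only: the integral is approximated by Simpson's rule, and the function names are hypothetical helpers, not notation from the text.

```python
import math


def integrate(func, a, b, steps=2000):
    """Approximate the integral of func over [a, b] by composite Simpson's rule."""
    h = (b - a) / steps  # steps is assumed to be even
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def partial_sum(x, n):
    """Order-n least squares approximation of f(x) = x from Example 1."""
    return math.pi - 2 * sum(math.sin(k * x) / k for k in range(1, n + 1))


def mean_square_error(n):
    """Integral over [0, 2*pi] of the squared deviation between x and its approximation."""
    return integrate(lambda x: (x - partial_sum(x, n)) ** 2, 0, 2 * math.pi)


errors = [mean_square_error(n) for n in (1, 2, 4, 8)]
print(errors)  # each entry should be smaller than the one before it
```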




Jean Baptiste Joseph Fourier (1768-1830) was a French mathematician and physicist who discovered the Fourier series 
and related ideas while working on problems of heat diffusion. This discovery was one of the most influential in the 
history of mathematics; it is the cornerstone of many fields of mathematical research and a basic tool in many branches of 
engineering. Fourier, a political activist during the French revolution, spent time in jail for his defense of many victims 
during the Terror. He later became a favorite of Napoleon and was named a baron. 



Exercise Set 9.4 






1. Find the least squares approximation of f(x) = 1 + x over the interval \([0, 2\pi]\) by 



(a) a trigonometric polynomial of order 2 or less 



(b) a trigonometric polynomial of order n or less 



2. Find the least squares approximation of \(f(x) = x^2\) over the interval \([0, 2\pi]\) by 



(a) a trigonometric polynomial of order 3 or less 

(b) a trigonometric polynomial of order n or less 



3. 

(a) Find the least squares approximation of x over the interval [0, 1] by a function of the form \(a + be^x\). 



(b) Find the mean square error of the approximation. 



4. 

(a) Find the least squares approximation of \(e^x\) over the interval [0, 1] by a polynomial of the form \(a_0 + a_1 x\). 



(b) Find the mean square error of the approximation. 



5. 

(a) Find the least squares approximation of \(\sin \pi x\) over the interval [-1, 1] by a polynomial of the form \(a_0 + a_1 x + a_2 x^2\). 



(b) Find the mean square error of the approximation. 

6. Use the Gram-Schmidt process to obtain the orthonormal basis (5) from the basis (3). 

7. Carry out the integrations in (9a), (9b), and (9c). 

8. Find the Fourier series of \(f(x) = \pi - x\) over the interval \([0, 2\pi]\). 

9. Find the Fourier series of f(x) = 1, \(0 < x < \pi\) and f(x) = 0, \(\pi < x < 2\pi\) over the interval \([0, 2\pi]\). 

10. What is the Fourier series of sin(3x)? 
9.5 

QUADRATIC FORMS 



In this section we shall study functions in which the terms are squares of 
variables or products of two variables. Such functions arise in a variety of 
applications, including geometry, vibrations of mechanical systems, 
statistics, and electrical engineering. 



Quadratic Forms 

Up to now, we have been interested primarily in linear equations, that is, in equations of the form 

\[ a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b \]

The expression on the left side of this equation, 

\[ a_1 x_1 + a_2 x_2 + \cdots + a_n x_n \]

is a function of n variables, called a linear form. In a linear form, all variables occur to the first power, and there are no 
products of variables in the expression. Here, we will be concerned with quadratic forms, which are functions of the form 

\[ a_1 x_1^2 + a_2 x_2^2 + \cdots + a_n x_n^2 + (\text{all possible terms of the form } a_k x_i x_j \text{ for } i < j) \tag{1} \]

For example, the most general quadratic form in the variables \(x_1\) and \(x_2\) is 

\[ a_1 x_1^2 + a_2 x_2^2 + a_3 x_1 x_2 \tag{2} \]

and the most general quadratic form in the variables \(x_1\), \(x_2\), and \(x_3\) is 

\[ a_1 x_1^2 + a_2 x_2^2 + a_3 x_3^2 + a_4 x_1 x_2 + a_5 x_1 x_3 + a_6 x_2 x_3 \tag{3} \]

The terms in a quadratic form that involve products of different variables are called the cross-product terms. Thus, in (2) the 
last term is a cross-product term, and in (3) the last three terms are cross-product terms. 



If we follow the convention of omitting brackets on the resulting 1 x 1 matrices, then (2) can be written in matrix form as 

\[ \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} a_1 & a_3/2 \\ a_3/2 & a_2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \tag{4} \]

and (3) can be written as 

\[ \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} \begin{bmatrix} a_1 & a_4/2 & a_5/2 \\ a_4/2 & a_2 & a_6/2 \\ a_5/2 & a_6/2 & a_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \tag{5} \]

(verify by multiplying out). Note that the products in (4) and (5) are both of the form \(\mathbf{x}^T A \mathbf{x}\), where \(\mathbf{x}\) is the column vector of 
variables, and A is a symmetric matrix whose diagonal entries are the coefficients of the squared terms and whose entries off 
the main diagonal are half the coefficients of the cross-product terms. More precisely, the diagonal entry in row i and column 
i is the coefficient of \(x_i^2\), and the off-diagonal entry in row i and column j is half the coefficient of the product \(x_i x_j\). Here are 
some examples. 



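EXAMPLE 1 Matrix Representation of Quadratic Forms 

\[ 2x^2 + 6xy - 7y^2 = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 2 & 3 \\ 3 & -7 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]

\[ 4x^2 - 5y^2 = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 4 & 0 \\ 0 & -5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]

\[ xy = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 0 & \tfrac{1}{2} \\ \tfrac{1}{2} & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]

\[ x_1^2 + 7x_2^2 - 3x_3^2 + 4x_1 x_2 - 2x_1 x_3 + 6x_2 x_3 = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} \begin{bmatrix} 1 & 2 & -1 \\ 2 & 7 & 3 \\ -1 & 3 & -3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \]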
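The rule "squared-term coefficients on the diagonal, half of each cross-product coefficient off the diagonal" can be sketched in a few lines of code. This is an illustration under assumed conventions, not part of the original text: the function names and the 0-based index pairs used to label cross-product terms are hypothetical choices.

```python
from fractions import Fraction


def symmetric_matrix(squares, cross):
    """Build the symmetric matrix A of a quadratic form.

    squares[i] is the coefficient of x_i^2, and cross[(i, j)] with i < j is
    the coefficient of the cross-product term x_i x_j (0-based indices).
    """
    n = len(squares)
    a = [[Fraction(0)] * n for _ in range(n)]
    for i, coeff in enumerate(squares):
        a[i][i] = Fraction(coeff)
    for (i, j), coeff in cross.items():
        # Each cross-product coefficient is split evenly between a_ij and a_ji.
        a[i][j] = a[j][i] = Fraction(coeff, 2)
    return a


def evaluate(a, x):
    """Compute x^T A x directly from the entries of A."""
    n = len(x)
    return sum(x[i] * a[i][j] * x[j] for i in range(n) for j in range(n))


# Last quadratic form of Example 1:
# x1^2 + 7 x2^2 - 3 x3^2 + 4 x1 x2 - 2 x1 x3 + 6 x2 x3
A = symmetric_matrix([1, 7, -3], {(0, 1): 4, (0, 2): -2, (1, 2): 6})
print(A)
print(evaluate(A, [1, 1, 1]))
```

Exact fractions are used so that odd cross-product coefficients, such as the 1 in xy, produce the entry 1/2 without rounding.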



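Symmetric matrices are useful, but not essential, for representing quadratic forms. For example, the quadratic form 
\(2x^2 + 6xy - 7y^2\), which we represented in Example 1 as \(\mathbf{x}^T A \mathbf{x}\) with a symmetric matrix A, can also be written as 

\[ 2x^2 + 6xy - 7y^2 = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 2 & 5 \\ 1 & -7 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]

where the coefficient 6 of the cross-product term has been split as 5 + 1 rather than 3 + 3, as in the symmetric representation. 
However, symmetric matrices are usually more convenient to work with, so when we write a quadratic form as \(\mathbf{x}^T A \mathbf{x}\), it will 
always be understood, even if it is not stated explicitly, that A is symmetric. When convenient, we can use Formula 7 of 
Section 4.1 to express a quadratic form \(\mathbf{x}^T A \mathbf{x}\) in terms of the Euclidean inner product as 

\[ \mathbf{x}^T A \mathbf{x} = \mathbf{x}^T (A\mathbf{x}) = A\mathbf{x} \cdot \mathbf{x} \quad \text{or, by symmetry of the dot product,} \quad \mathbf{x}^T A \mathbf{x} = \mathbf{x} \cdot A\mathbf{x} \]

If preferred, we can use the notation \(\mathbf{u} \cdot \mathbf{v} = \langle \mathbf{u}, \mathbf{v} \rangle\) for the dot product and write these expressions as 

\[ \mathbf{x}^T A \mathbf{x} = \mathbf{x}^T (A\mathbf{x}) = \langle A\mathbf{x}, \mathbf{x} \rangle = \langle \mathbf{x}, A\mathbf{x} \rangle \tag{6} \]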



Problems Involving Quadratic Forms 

The study of quadratic forms is an extensive topic that we can only touch on in this section. The following are some of the 
important mathematical problems that involve quadratic forms. 



Find the maximum and minimum values of the quadratic form \(\mathbf{x}^T A \mathbf{x}\) if \(\mathbf{x}\) is constrained so that 

\[ \|\mathbf{x}\| = (x_1^2 + x_2^2 + \cdots + x_n^2)^{1/2} = 1 \]

What conditions must A satisfy in order for a quadratic form to satisfy the inequality \(\mathbf{x}^T A \mathbf{x} > 0\) for all \(\mathbf{x} \neq \mathbf{0}\)? 

If \(\mathbf{x}^T A \mathbf{x}\) is a quadratic form in two or three variables and c is a constant, what does the graph of the equation \(\mathbf{x}^T A \mathbf{x} = c\) 
look like? 

If P is an orthogonal matrix, the change of variables \(\mathbf{x} = P\mathbf{y}\) converts the quadratic form \(\mathbf{x}^T A \mathbf{x}\) to 
\((P\mathbf{y})^T A (P\mathbf{y}) = \mathbf{y}^T (P^T A P)\mathbf{y}\). But \(P^T A P\) is a symmetric matrix if A is (verify), so \(\mathbf{y}^T (P^T A P)\mathbf{y}\) is a new quadratic form in 
the variables of \(\mathbf{y}\). It is important to know whether P can be chosen such that this new quadratic form has no 
cross-product terms. 



In this section we shall study the first two problems, and in the following sections we shall study the last two. The following 
theorem provides a solution to the first problem. The proof is deferred to the end of this section. 



THEOREM 9.5.1 



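Let A be a symmetric n x n matrix with eigenvalues \(\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n\). If \(\mathbf{x}\) is constrained so that \(\|\mathbf{x}\| = 1\), then 

(a) \(\lambda_1 \geq \mathbf{x}^T A \mathbf{x} \geq \lambda_n\). 

(b) \(\mathbf{x}^T A \mathbf{x} = \lambda_n\) if \(\mathbf{x}\) is an eigenvector of A corresponding to \(\lambda_n\) and \(\mathbf{x}^T A \mathbf{x} = \lambda_1\) if \(\mathbf{x}\) is an eigenvector of A 
corresponding to \(\lambda_1\). 

It follows from this theorem that subject to the constraint 

\[ \|\mathbf{x}\| = (x_1^2 + x_2^2 + \cdots + x_n^2)^{1/2} = 1 \]

the quadratic form \(\mathbf{x}^T A \mathbf{x}\) has a maximum value of \(\lambda_1\) (the largest eigenvalue) and a minimum value of \(\lambda_n\) (the smallest 
eigenvalue). 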



EXAMPLE 2 Consequences of Theorem 9.5.1 



Find the maximum and minimum values of the quadratic form 

\[ x_1^2 + x_2^2 + 4x_1 x_2 \]

subject to the constraint \(x_1^2 + x_2^2 = 1\), and determine values of \(x_1\) and \(x_2\) at which the maximum and minimum occur. 

Solution 

The quadratic form can be written as 

\[ x_1^2 + x_2^2 + 4x_1 x_2 = \mathbf{x}^T A \mathbf{x} = \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \]

The characteristic equation of A is 

\[ \det(\lambda I - A) = \det \begin{bmatrix} \lambda - 1 & -2 \\ -2 & \lambda - 1 \end{bmatrix} = \lambda^2 - 2\lambda - 3 = (\lambda - 3)(\lambda + 1) = 0 \]

Thus the eigenvalues of A are \(\lambda = 3\) and \(\lambda = -1\), which are the maximum and minimum values, respectively, of the 
quadratic form subject to the constraint. To find values of \(x_1\) and \(x_2\) at which these extreme values occur, we must find 
eigenvectors corresponding to these eigenvalues and then normalize these eigenvectors to satisfy the condition \(x_1^2 + x_2^2 = 1\). 

We leave it for the reader to show that bases for the eigenspaces are 

\[ \lambda = 3: \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad \lambda = -1: \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]

Normalizing these eigenvectors yields 

\[ \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}, \qquad \begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix} \]

Thus, subject to the constraint \(x_1^2 + x_2^2 = 1\), the maximum value of the quadratic form is \(\lambda = 3\), which occurs if \(x_1 = 1/\sqrt{2}\), 
\(x_2 = 1/\sqrt{2}\); and the minimum value is \(\lambda = -1\), which occurs if \(x_1 = 1/\sqrt{2}\), \(x_2 = -1/\sqrt{2}\). Moreover, alternative bases for 
the eigenspaces can be obtained by multiplying the basis vectors above by -1. Thus the maximum value, \(\lambda = 3\), also occurs if 
\(x_1 = -1/\sqrt{2}\), \(x_2 = -1/\sqrt{2}\); similarly, the minimum value, \(\lambda = -1\), also occurs if \(x_1 = -1/\sqrt{2}\), \(x_2 = 1/\sqrt{2}\). 
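The conclusion of Example 2 can be checked by sampling the quadratic form on the unit circle and comparing with the eigenvalues. The sketch below is an illustration, not the text's method: it uses the closed-form eigenvalue expression for a symmetric 2 x 2 matrix with rows (a, b) and (b, c), and the function names are hypothetical.

```python
import math


def quadratic_form(a, b, c, x1, x2):
    """Value of x^T A x for the symmetric matrix A with rows (a, b) and (b, c)."""
    return a * x1 * x1 + 2 * b * x1 * x2 + c * x2 * x2


def symmetric_2x2_eigenvalues(a, b, c):
    """Return (largest, smallest) eigenvalue of the matrix with rows (a, b), (b, c)."""
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return mean + radius, mean - radius


# Example 2: x1^2 + x2^2 + 4 x1 x2, so a = 1, b = 2, c = 1.
largest, smallest = symmetric_2x2_eigenvalues(1, 2, 1)

# Sample unit vectors (cos t, sin t) and record the values of the form.
samples = [quadratic_form(1, 2, 1, math.cos(t), math.sin(t))
           for t in (2 * math.pi * i / 3600 for i in range(3600))]

print(largest, smallest)
print(max(samples), min(samples))
```

The sampled maximum and minimum should match the two eigenvalues to within rounding, as Theorem 9.5.1 predicts.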



DEFINITION 



A quadratic form \(\mathbf{x}^T A \mathbf{x}\) is called positive definite if \(\mathbf{x}^T A \mathbf{x} > 0\) for all \(\mathbf{x} \neq \mathbf{0}\), and a symmetric matrix A is called a positive 
definite matrix if \(\mathbf{x}^T A \mathbf{x}\) is a positive definite quadratic form. 



The following theorem is an important result about positive definite matrices. 



THEOREM 9.5.2 



A symmetric matrix A is positive definite if and only if all the eigenvalues of A are positive. 



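Proof Assume that A is positive definite, and let \(\lambda\) be any eigenvalue of A. If \(\mathbf{x}\) is an eigenvector of A corresponding to \(\lambda\), 
then \(\mathbf{x} \neq \mathbf{0}\) and \(A\mathbf{x} = \lambda\mathbf{x}\), so 

\[ 0 < \mathbf{x}^T A \mathbf{x} = \mathbf{x}^T \lambda \mathbf{x} = \lambda \mathbf{x}^T \mathbf{x} = \lambda \|\mathbf{x}\|^2 \tag{7} \]

where \(\|\mathbf{x}\|\) is the Euclidean norm of \(\mathbf{x}\). Since \(\|\mathbf{x}\|^2 > 0\) it follows that \(\lambda > 0\), which is what we wanted to 
show. 

Conversely, assume that all eigenvalues of A are positive. We must show that \(\mathbf{x}^T A \mathbf{x} > 0\) for all \(\mathbf{x} \neq \mathbf{0}\). But if \(\mathbf{x} \neq \mathbf{0}\), we can 
normalize \(\mathbf{x}\) to obtain the vector \(\mathbf{y} = \mathbf{x}/\|\mathbf{x}\|\) with the property \(\|\mathbf{y}\| = 1\). It now follows from Theorem 9.5.1 that 

\[ \mathbf{y}^T A \mathbf{y} \geq \lambda_n > 0 \]

where \(\lambda_n\) is the smallest eigenvalue of A. Thus, 

\[ \mathbf{y}^T A \mathbf{y} = \left(\frac{\mathbf{x}}{\|\mathbf{x}\|}\right)^T A \left(\frac{\mathbf{x}}{\|\mathbf{x}\|}\right) = \frac{1}{\|\mathbf{x}\|^2}\,\mathbf{x}^T A \mathbf{x} > 0 \]

Multiplying through by \(\|\mathbf{x}\|^2\) yields 

\[ \mathbf{x}^T A \mathbf{x} > 0 \]

which is what we wanted to show. 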



EXAMPLE 3 Showing That a Matrix Is Positive Definite 



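In Example 1 of Section 7.3, we showed that the symmetric matrix 

\[ A = \begin{bmatrix} 4 & 2 & 2 \\ 2 & 4 & 2 \\ 2 & 2 & 4 \end{bmatrix} \]

has eigenvalues \(\lambda = 2\) and \(\lambda = 8\). Since these are positive, the matrix A is positive definite, and for all \(\mathbf{x} \neq \mathbf{0}\), 

\[ \mathbf{x}^T A \mathbf{x} = 4x_1^2 + 4x_2^2 + 4x_3^2 + 4x_1 x_2 + 4x_1 x_3 + 4x_2 x_3 > 0 \]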



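Our next objective is to give a criterion that can be used to determine whether a symmetric matrix is positive definite without 
finding its eigenvalues. To do this, it will be helpful to introduce some terminology. If 

\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \]

is a square matrix, then the principal submatrices of A are the submatrices formed from the first r rows and r columns of A 
for r = 1, 2, ..., n. These submatrices are 

\[ A_1 = [a_{11}],\quad A_2 = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix},\quad A_3 = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix},\ \ldots,\quad A_n = A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \]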



THEOREM 9.5.3 



A symmetric matrix A is positive definite if and only if the determinant of every principal submatrix is positive. 



We omit the proof. 



EXAMPLE 4 Working with Principal Submatrices 



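The matrix 

\[ A = \begin{bmatrix} 2 & -1 & -3 \\ -1 & 2 & 4 \\ -3 & 4 & 9 \end{bmatrix} \]

is positive definite since 

\[ |2| = 2,\qquad \begin{vmatrix} 2 & -1 \\ -1 & 2 \end{vmatrix} = 3,\qquad \begin{vmatrix} 2 & -1 & -3 \\ -1 & 2 & 4 \\ -3 & 4 & 9 \end{vmatrix} = 1 \]

all of which are positive. Thus we are guaranteed that all eigenvalues of A are positive and \(\mathbf{x}^T A \mathbf{x} > 0\) for all \(\mathbf{x} \neq \mathbf{0}\). 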
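The test in Theorem 9.5.3 is easy to automate for small matrices. The following is a minimal sketch, assuming integer or rational entries and using cofactor expansion for the determinant (adequate for small matrices only); the function names are illustrative and do not come from the text.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def principal_submatrix_determinants(a):
    """Determinants of the submatrices formed from the first r rows and columns."""
    return [det([row[:r] for row in a[:r]]) for r in range(1, len(a) + 1)]


def is_positive_definite(a):
    """Apply Theorem 9.5.3 to a symmetric matrix a."""
    return all(d > 0 for d in principal_submatrix_determinants(a))


A = [[2, -1, -3],
     [-1, 2, 4],
     [-3, 4, 9]]
print(principal_submatrix_determinants(A))
print(is_positive_definite(A))
```

For the matrix of Example 4 the three determinants are 2, 3, and 1, in agreement with the computation above.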



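Remark A symmetric matrix A and the quadratic form \(\mathbf{x}^T A \mathbf{x}\) are called 

positive semidefinite if \(\mathbf{x}^T A \mathbf{x} \geq 0\) for all \(\mathbf{x}\) 

negative definite if \(\mathbf{x}^T A \mathbf{x} < 0\) for \(\mathbf{x} \neq \mathbf{0}\) 

negative semidefinite if \(\mathbf{x}^T A \mathbf{x} \leq 0\) for all \(\mathbf{x}\) 

indefinite if \(\mathbf{x}^T A \mathbf{x}\) has both positive and negative values 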

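Theorems 9.5.2 and 9.5.3 can be modified to apply to matrices of the first three types, although some care is needed. 
For example, a symmetric matrix A is positive semidefinite if and only if all of its eigenvalues are nonnegative. Also, if A is 
positive semidefinite, then all its principal submatrices have nonnegative determinants. The converse can fail for the principal 
submatrices as defined above: the 2 x 2 matrix with rows (0, 0) and (0, -1) has both determinants equal to 0 but is not 
positive semidefinite. 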

Optional 



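Proof of Theorem 9.5.1a Since A is symmetric, it follows from Theorem 7.3.1 that there is an orthonormal basis for \(R^n\) 
consisting of eigenvectors of A. Suppose that \(S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}\) is such a basis, where \(\mathbf{v}_i\) is the eigenvector corresponding 
to the eigenvalue \(\lambda_i\). If \(\langle\ ,\ \rangle\) denotes the Euclidean inner product, then it follows from Theorem 6.3.1 that for any \(\mathbf{x}\) in \(R^n\), 

\[ \mathbf{x} = \langle \mathbf{x}, \mathbf{v}_1 \rangle \mathbf{v}_1 + \langle \mathbf{x}, \mathbf{v}_2 \rangle \mathbf{v}_2 + \cdots + \langle \mathbf{x}, \mathbf{v}_n \rangle \mathbf{v}_n \]

Thus 

\[
\begin{aligned}
A\mathbf{x} &= \langle \mathbf{x}, \mathbf{v}_1 \rangle A\mathbf{v}_1 + \langle \mathbf{x}, \mathbf{v}_2 \rangle A\mathbf{v}_2 + \cdots + \langle \mathbf{x}, \mathbf{v}_n \rangle A\mathbf{v}_n \\
&= \langle \mathbf{x}, \mathbf{v}_1 \rangle \lambda_1 \mathbf{v}_1 + \langle \mathbf{x}, \mathbf{v}_2 \rangle \lambda_2 \mathbf{v}_2 + \cdots + \langle \mathbf{x}, \mathbf{v}_n \rangle \lambda_n \mathbf{v}_n \\
&= \lambda_1 \langle \mathbf{x}, \mathbf{v}_1 \rangle \mathbf{v}_1 + \lambda_2 \langle \mathbf{x}, \mathbf{v}_2 \rangle \mathbf{v}_2 + \cdots + \lambda_n \langle \mathbf{x}, \mathbf{v}_n \rangle \mathbf{v}_n
\end{aligned}
\]

It follows that the coordinate vectors for \(\mathbf{x}\) and \(A\mathbf{x}\) relative to the basis S are 

\[
\begin{aligned}
(\mathbf{x})_S &= (\langle \mathbf{x}, \mathbf{v}_1 \rangle, \langle \mathbf{x}, \mathbf{v}_2 \rangle, \ldots, \langle \mathbf{x}, \mathbf{v}_n \rangle) \\
(A\mathbf{x})_S &= (\lambda_1 \langle \mathbf{x}, \mathbf{v}_1 \rangle, \lambda_2 \langle \mathbf{x}, \mathbf{v}_2 \rangle, \ldots, \lambda_n \langle \mathbf{x}, \mathbf{v}_n \rangle)
\end{aligned}
\]

Thus, from Theorem 6.3.2c and the fact that \(\|\mathbf{x}\| = 1\), we obtain 

\[
\begin{aligned}
\|\mathbf{x}\|^2 &= \langle \mathbf{x}, \mathbf{v}_1 \rangle^2 + \langle \mathbf{x}, \mathbf{v}_2 \rangle^2 + \cdots + \langle \mathbf{x}, \mathbf{v}_n \rangle^2 = 1 \\
\langle \mathbf{x}, A\mathbf{x} \rangle &= \lambda_1 \langle \mathbf{x}, \mathbf{v}_1 \rangle^2 + \lambda_2 \langle \mathbf{x}, \mathbf{v}_2 \rangle^2 + \cdots + \lambda_n \langle \mathbf{x}, \mathbf{v}_n \rangle^2
\end{aligned}
\]

Using these two equations and Formula 6, we can prove that \(\mathbf{x}^T A \mathbf{x} \leq \lambda_1\) as follows: 

\[
\begin{aligned}
\mathbf{x}^T A \mathbf{x} = \langle \mathbf{x}, A\mathbf{x} \rangle &= \lambda_1 \langle \mathbf{x}, \mathbf{v}_1 \rangle^2 + \lambda_2 \langle \mathbf{x}, \mathbf{v}_2 \rangle^2 + \cdots + \lambda_n \langle \mathbf{x}, \mathbf{v}_n \rangle^2 \\
&\leq \lambda_1 \langle \mathbf{x}, \mathbf{v}_1 \rangle^2 + \lambda_1 \langle \mathbf{x}, \mathbf{v}_2 \rangle^2 + \cdots + \lambda_1 \langle \mathbf{x}, \mathbf{v}_n \rangle^2 \\
&= \lambda_1 \left(\langle \mathbf{x}, \mathbf{v}_1 \rangle^2 + \langle \mathbf{x}, \mathbf{v}_2 \rangle^2 + \cdots + \langle \mathbf{x}, \mathbf{v}_n \rangle^2\right) \\
&= \lambda_1
\end{aligned}
\]

The proof that \(\lambda_n \leq \mathbf{x}^T A \mathbf{x}\) is similar and is left as an exercise. 

Proof of Theorem 9.5.1b If \(\mathbf{x}\) is an eigenvector of A corresponding to \(\lambda_1\) and \(\|\mathbf{x}\| = 1\), then 

\[ \mathbf{x}^T A \mathbf{x} = \langle \mathbf{x}, A\mathbf{x} \rangle = \langle \mathbf{x}, \lambda_1 \mathbf{x} \rangle = \lambda_1 \langle \mathbf{x}, \mathbf{x} \rangle = \lambda_1 \|\mathbf{x}\|^2 = \lambda_1 \]

Similarly, \(\mathbf{x}^T A \mathbf{x} = \lambda_n\) if \(\|\mathbf{x}\| = 1\) and \(\mathbf{x}\) is an eigenvector of A corresponding to \(\lambda_n\). 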



Exercise Set 9.5 






Which of the following are quadratic forms? 
1. 



(a) x 2 -{2xy 

<W 5x\-2xl I Ax i* 2 

(c) Ax\-2xl i *|-5*i* 3 

^ * 2 -7* 2 I *| I 4* i^ 2 ^3 

(e) *i*2 — 3*i*3 4- 2*2*3 

(f) * 2 -6*J I *i-5* 2 

(g) (*i-3* 2 ) 2 

(h) (*,-*3) 2 | 2(*i I 4*?) 2 

2. Express the following quadratic forms in the matrix notation \(\mathbf{x}^T A \mathbf{x}\), where A is a symmetric matrix. 

(a) \(3x_1^2 + 7x_2^2\) 

(b) \(4x_1^2 - 9x_2^2 - 6x_1 x_2\) 

(c) \(5x_1^2 + 5x_1 x_2\) 

(d) \(-7x_1 x_2\) 


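3. Express the following quadratic forms in the matrix notation \(\mathbf{x}^T A \mathbf{x}\), where A is a symmetric matrix. 

(a) \(9x_1^2 - x_2^2 + 4x_3^2 + 6x_1 x_2 - 8x_1 x_3 + x_2 x_3\) 

(b) \(x_1^2 + x_2^2 - 3x_3^2 - 5x_1 x_2 + 9x_1 x_3\) 

(c) \(x_1 x_2 + x_1 x_3 + x_2 x_3\) 

(d) \(\sqrt{2}\,x_1^2 - \sqrt{3}\,x_3^2 + 2\sqrt{2}\,x_1 x_2 - 8\sqrt{3}\,x_1 x_3\) 

(e) \(x_1^2 + x_2^2 - x_3^2 - x_4^2 + 2x_1 x_2 - 10x_1 x_4 + 4x_3 x_4\) 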



In each part, find a formula for the quadratic form that does not use matrices. 



(a) r* 



[* y] 



2 -3 

3 5 



(b) 



[*1 *2] 



HI 


~*1~ 


[f "\ 


*2 



(c) 



[x y z] 






0~ 


"x" 


3 





y 





5 


z 



(d) 



[*1 *2 *3] 



9 


7 


r 






2 


2 


r* 1 ! 


7 





6 


x? 


2 






X3 


1 








fi 


3 




2 









(e) 



[xi x 2 x 3 x 4 ] 






1 


1 


f 


~*l" 


1 





1 


1 


*2 


1 


1 





1 


*3 


1 


1 


1 





x 4 



5. In each part, find the maximum and minimum values of the quadratic form subject to the constraint \(x_1^2 + x_2^2 = 1\), and 
determine the values of \(x_1\) and \(x_2\) at which the maximum and minimum occur. 



<» 5*1-4 



(b) 7x? I 4x? I xix 2 



( c) 5x 2 I 2x\-x x x 2 
^ 2x 2 |-^| + 3^i^2 



6. 



In each part, find the maximum and minimum values of the quadratic form subject to the constraint \(x_1^2 + x_2^2 + x_3^2 = 1\), and 
determine the values of \(x_1\), \(x_2\), and \(x_3\) at which the maximum and minimum occur. 



( a ) x 2 I x\ I 2^3-2^1^2 i *<x\xi + Ax2Xi 



^ 2x\+x\ + x 2 A 2xix 3 + 2xix 2 



(c) 3x 2 I 2x\ I 3^ + 2x1^3 



7. 



Use Theorem 9.5.2 to determine which of the following matrices are positive definite. 



(a) 



2 3 

3 2 



(b) 



5 -1 
-1 5 



(c) 



2 -2 
-2 -1 



8. 



Use Theorem 9.5.3 to determine which of the matrices in Exercise 7 are positive definite. 



Use Theorem 9.5.2 to determine which of the following matrices are positive definite. 



(a) 



3 -1 

-1 2 

-1 





-1 
3 



(b) 



"0 


1 


r 


1 





i 


1 


1 






(c) 



"1 


2 


f 


2 


1 


1 


1 


1 


3 



10. 



Use Theorem 9.5.3 to determine which of the matrices in Exercise 9 are positive definite. 



11. In each part, classify the quadratic form as positive definite, positive semidefinite, negative definite, negative 
semidefinite, or indefinite. 



(a) *? i A 



O) -,?-3* 2 2 

(c) (*i-* 2 ) 2 

(d) -(*!-* 2 ) 2 



(e) x 2 x 2 



(f) *1*2 



12. In each part, classify the matrix as positive definite, positive semidefinite, negative definite, negative semidefinite, or 
indefinite. 



(a) 



3 


0" 





-2 





1 



(b) 



-5 





0" 

















1 



(c) 



"6 7 


r 


7 9 


2 


1 2 


i 



(d) 



■4 7 

7 -3 

8 9 



(e) 



"0 





0" 





















(f) 



"1 





0" 





1 











1 



13. 



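Let \(\mathbf{x}^T A \mathbf{x}\) be a quadratic form in \(x_1, x_2, \ldots, x_n\) and define \(T: R^n \to R\) by \(T(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}\). 

(a) Show that \(T(\mathbf{x} + \mathbf{y}) = T(\mathbf{x}) + 2\mathbf{x}^T A \mathbf{y} + T(\mathbf{y})\). 

(b) Show that \(T(k\mathbf{x}) = k^2 T(\mathbf{x})\). 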



(c) Is T a linear transformation? Explain. 



14. 



In each part, find all values of k for which the quadratic form is positive definite. 



( a ) x\ I kx\- Ax ix 2 



^ 5x\ I ^2 I kxj I 4^1^2-2x1^2-2^1^3 



^ ' 2x\ I j:-? I 2x*i I 2j:ij:3 I 2^X2^3 



15. 



Express the quadratic form \((c_1 x_1 + c_2 x_2 + \cdots + c_n x_n)^2\) in the matrix notation \(\mathbf{x}^T A \mathbf{x}\), where A is symmetric. 



16. 



Let \(\mathbf{x} = (x_1, x_2, \ldots, x_n)\). In statistics, the quantity 

\[ \bar{x} = \frac{1}{n}(x_1 + x_2 + \cdots + x_n) \]

is called the sample mean of \(x_1, x_2, \ldots, x_n\), and 

\[ s_x^2 = \frac{1}{n-1}\left[(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2\right] \]

is called the sample variance. 

(a) Express the quadratic form \(s_x^2\) in the matrix notation \(\mathbf{x}^T A \mathbf{x}\), where A is symmetric. 

(b) Is \(s_x^2\) a positive definite quadratic form? Explain. 



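17. Complete the proof of Theorem 9.5.1 by showing that \(\lambda_n \leq \mathbf{x}^T A \mathbf{x}\) if \(\|\mathbf{x}\| = 1\) and \(\lambda_n = \mathbf{x}^T A \mathbf{x}\) if \(\mathbf{x}\) is an eigenvector of A 
corresponding to \(\lambda_n\). 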



Discussion 
Discovery 

18. Indicate whether each statement is true (T) or false (F). Justify your answer. 



(a) A symmetric matrix with positive entries is positive definite. 



(b) \(x_1^2 - x_2^2 + x_3^2 + 4x_1 x_2 x_3\) is a quadratic form. 

(c) \((x_1 - 3x_2)^2\) is a quadratic form. 



(d) A positive definite matrix is invertible. 



(e) A symmetric matrix is positive definite, negative definite, or indefinite. 



(f) If A is positive definite, then \(-A\) is negative definite. 



19. 



Indicate whether each statement is true (T) or false (F). Justify your answer. 



(a) If x is a vector in R n , then x - x is a quadratic form. 



(b) If \(\mathbf{x}^T A \mathbf{x}\) is a positive definite quadratic form, then so is \(\mathbf{x}^T A^{-1} \mathbf{x}\). 



(c) If A is a matrix with positive eigenvalues, then x T Ax is a positive definite quadratic 
form. 



(d) If A is a symmetric 2x2 matrix with positive entries and a positive determinant, then A 
is positive definite. 



(e) If x T Ax is a quadratic form with no cross-product terms, then A is a diagonal matrix. 



(f) If \(\mathbf{x}^T A \mathbf{x}\) is a positive definite quadratic form in x and y, and if \(c \neq 0\), then the graph of 
the equation \(\mathbf{x}^T A \mathbf{x} = c\) is an ellipse. 



20. What property must a symmetric 2 x 2 matrix A have for \(\mathbf{x}^T A \mathbf{x} = 1\) to represent a circle? 
9.6 

DIAGONALIZING 
QUADRATIC FORMS; 
CONIC SECTIONS 



In this section we shall show how to remove the cross-product terms from a 
quadratic form by changing variables, and we shall use our results to study 
the graphs of the conic sections. 



Diagonalization of Quadratic Forms 



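Let 

\[ \mathbf{x}^T A \mathbf{x} = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \tag{1} \]

be a quadratic form, where A is a symmetric matrix. We know from Theorem 7.3.1 that there is an orthogonal matrix P that 
diagonalizes A; that is, 

\[ P^T A P = D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \]

where \(\lambda_1, \lambda_2, \ldots, \lambda_n\) are the eigenvalues of A. If we let 

\[ \mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \]

and make the substitution \(\mathbf{x} = P\mathbf{y}\) in (1), then we obtain 

\[ \mathbf{x}^T A \mathbf{x} = (P\mathbf{y})^T A P\mathbf{y} = \mathbf{y}^T P^T A P \mathbf{y} = \mathbf{y}^T D \mathbf{y} \]

But 

\[ \mathbf{y}^T D \mathbf{y} = \begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \lambda_1 y_1^2 + \lambda_2 y_2^2 + \cdots + \lambda_n y_n^2 \]

which is a quadratic form with no cross-product terms. 
In summary, we have the following result. 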
In summary, we have the following result. 



THEOREM 9.6.1 



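Let \(\mathbf{x}^T A \mathbf{x}\) be a quadratic form in the variables \(x_1, x_2, \ldots, x_n\), where A is symmetric. If P orthogonally diagonalizes A, 
and if the new variables \(y_1, y_2, \ldots, y_n\) are defined by the equation \(\mathbf{x} = P\mathbf{y}\), then substituting this equation in \(\mathbf{x}^T A \mathbf{x}\) yields 

\[ \mathbf{x}^T A \mathbf{x} = \mathbf{y}^T D \mathbf{y} = \lambda_1 y_1^2 + \lambda_2 y_2^2 + \cdots + \lambda_n y_n^2 \]

where \(\lambda_1, \lambda_2, \ldots, \lambda_n\) are the eigenvalues of A and 

\[ D = P^T A P = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \]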



The matrix P in this theorem is said to orthogonally diagonalize the quadratic form or reduce the quadratic form to a sum of 
squares. 



EXAMPLE 1 Reducing a Quadratic Form to a Sum of Squares 



Find a change of variables that will reduce the quadratic form \(x_1^2 - x_3^2 - 4x_1 x_2 + 4x_2 x_3\) to a sum of squares, and express the 
quadratic form in terms of the new variables. 



Solution 

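The quadratic form can be written as 

\[ \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} \begin{bmatrix} 1 & -2 & 0 \\ -2 & 0 & 2 \\ 0 & 2 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \]

The characteristic equation of the 3 x 3 matrix is 

\[ \begin{vmatrix} \lambda - 1 & 2 & 0 \\ 2 & \lambda & -2 \\ 0 & -2 & \lambda + 1 \end{vmatrix} = \lambda^3 - 9\lambda = \lambda(\lambda + 3)(\lambda - 3) = 0 \]

so the eigenvalues are \(\lambda = 0\), \(\lambda = -3\), \(\lambda = 3\). We leave it for the reader to show that orthonormal bases for the three 
eigenspaces are 

\[ \lambda = 0: \begin{bmatrix} 2/3 \\ 1/3 \\ 2/3 \end{bmatrix}, \qquad \lambda = -3: \begin{bmatrix} -1/3 \\ -2/3 \\ 2/3 \end{bmatrix}, \qquad \lambda = 3: \begin{bmatrix} -2/3 \\ 2/3 \\ 1/3 \end{bmatrix} \]

Thus, a substitution \(\mathbf{x} = P\mathbf{y}\) that eliminates cross-product terms is 

\[ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 2/3 & -1/3 & -2/3 \\ 1/3 & -2/3 & 2/3 \\ 2/3 & 2/3 & 1/3 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} \]

or, equivalently, 

\[
\begin{aligned}
x_1 &= \tfrac{2}{3}y_1 - \tfrac{1}{3}y_2 - \tfrac{2}{3}y_3 \\
x_2 &= \tfrac{1}{3}y_1 - \tfrac{2}{3}y_2 + \tfrac{2}{3}y_3 \\
x_3 &= \tfrac{2}{3}y_1 + \tfrac{2}{3}y_2 + \tfrac{1}{3}y_3
\end{aligned}
\]

The new quadratic form is 

\[ \begin{bmatrix} y_1 & y_2 & y_3 \end{bmatrix} \begin{bmatrix} 0 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 3 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} \]

or, equivalently, 

\[ -3y_2^2 + 3y_3^2 \]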
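The diagonalization in Example 1 can be verified with exact rational arithmetic. This is a sketch for checking purposes only: the matrix P below is read off from the substitution equations for \(x_1\), \(x_2\), \(x_3\) in the example, and the helper functions `matmul` and `transpose` are hypothetical names introduced here.

```python
from fractions import Fraction as F


def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


# Symmetric matrix of x1^2 - x3^2 - 4 x1 x2 + 4 x2 x3.
A = [[1, -2, 0],
     [-2, 0, 2],
     [0, 2, -1]]

# Columns are unit eigenvectors for the eigenvalues 0, -3, 3 (in that order).
P = [[F(2, 3), F(-1, 3), F(-2, 3)],
     [F(1, 3), F(-2, 3), F(2, 3)],
     [F(2, 3), F(2, 3), F(1, 3)]]

D = matmul(matmul(transpose(P), A), P)   # should be diagonal
identity_check = matmul(transpose(P), P)  # should be the identity if P is orthogonal

print(D)
print(identity_check)
```

Because the entries are fractions, the off-diagonal entries of D come out exactly zero rather than merely small.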



Remark There are other methods for eliminating the cross-product terms from a quadratic form; we shall not discuss them 
here. Two such methods, Lagrange's reduction and Kronecker's reduction, are discussed in more advanced books. 

Conic Sections 



We shall now apply our work on quadratic forms to the study of equations of the form 

\[ ax^2 + 2bxy + cy^2 + dx + ey + f = 0 \tag{2} \]

where a, b, ..., f are real numbers, and at least one of the numbers a, b, c is not zero. An equation of this type is called a 
quadratic equation in x and y, and 

\[ ax^2 + 2bxy + cy^2 \]

is called the associated quadratic form. 



EXAMPLE 2 Coefficients in a Quadratic Equation 



In the quadratic equation 

\[ 3x^2 + 5xy - 7y^2 + 2x + 7 = 0 \]

the constants in (2) are 

\[ a = 3,\quad b = \tfrac{5}{2},\quad c = -7,\quad d = 2,\quad e = 0,\quad f = 7 \]



EXAMPLE 3 Examples of Associated Quadratic Forms 



Quadratic Equation                              Associated Quadratic Form 

\(3x^2 + 5xy - 7y^2 + 2x + 7 = 0\)              \(3x^2 + 5xy - 7y^2\) 

\(4x^2 - 5y^2 + 8y + 9 = 0\)                    \(4x^2 - 5y^2\) 

\(xy + y = 0\)                                  \(xy\) 



Graphs of quadratic equations in x and y are called conics or conic sections. The most important conics are ellipses, circles, 
hyperbolas, and parabolas; these are called the nondegenerate conics. The remaining conics are called degenerate and include 
single points and pairs of lines (see Exercise 15). 



A nondegenerate conic is said to be in standard position relative to the coordinate axes if its equation can be expressed in one 
of the forms given in Figure 9.6.1. 



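Figure 9.6.1 Nondegenerate conics in standard position: ellipse or circle, \(x^2/k^2 + y^2/l^2 = 1\) (k, l > 0); 
hyperbolas, \(x^2/k^2 - y^2/l^2 = 1\) and \(y^2/k^2 - x^2/l^2 = 1\) (k, l > 0); parabolas, \(y^2 = kx\) and \(x^2 = ky\), each shown for k > 0 and k < 0. 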



EXAMPLE 4 Three Conies 



From Figure 9.6.1, the equation 

\[ \frac{x^2}{4} + \frac{y^2}{9} = 1 \]

matches the form of an ellipse with k = 2 and l = 3. Thus the ellipse is in standard position, intersecting the x-axis at (-2, 0) 
and (2, 0) and intersecting the y-axis at (0, -3) and (0, 3). 

The equation \(x^2 - 8y^2 = -16\) can be rewritten as \(y^2/2 - x^2/16 = 1\), which is of the form \(y^2/k^2 - x^2/l^2 = 1\) with \(k = \sqrt{2}\), 
l = 4. Its graph is thus a hyperbola in standard position intersecting the y-axis at \((0, -\sqrt{2})\) and \((0, \sqrt{2})\). 

The equation \(5x^2 + 2y = 0\) can be rewritten as \(x^2 = -\tfrac{2}{5}y\), which is of the form \(x^2 = ky\) with \(k = -\tfrac{2}{5}\). Since k < 0, its graph 
is a parabola in standard position opening downward. 



Significance of the Cross-Product Term 

Observe that no conic in standard position has an xy-term (that is, a cross-product term) in its equation; the presence of an
xy-term in the equation of a nondegenerate conic indicates that the conic is rotated out of standard position (Figure 9.6.2a). Also,
no conic in standard position has both an $x^2$ and an x term or both a $y^2$ and a y term. If there is no cross-product term, the
occurrence of either of these pairs in the equation of a nondegenerate conic indicates that the conic is translated out of standard
position (Figure 9.6.2b). The occurrence of either of these pairs and a cross-product term usually indicates that the conic is both
rotated and translated out of standard position (Figure 9.6.2c).









[Figure 9.6.2 (drawings omitted): (a) Rotated. (b) Translated. (c) Rotated and translated.]

Figure 9.6.2

One technique for identifying the graph of a nondegenerate conic that is not in standard position consists of rotating and
translating the xy-coordinate axes to obtain an x'y'-coordinate system relative to which the conic is in standard position. Once
this is done, the equation of the conic in the x'y'-system will have one of the forms given in Figure 9.6.1 and can then easily be
identified.



EXAMPLE 5 Completing the Square and Translating 



Since the quadratic equation

$$2x^2 + y^2 - 12x - 4y + 18 = 0$$

contains $x^2$-, x-, $y^2$-, and y-terms but no cross-product term, its graph is a conic that is translated out of standard position but
not rotated. This conic can be brought into standard position by suitably translating coordinate axes. To do this, first collect
x-terms and y-terms. This yields

$$(2x^2 - 12x) + (y^2 - 4y) + 18 = 0 \quad\text{or}\quad 2(x^2 - 6x) + (y^2 - 4y) = -18$$

By completing the squares on the two expressions in parentheses, we obtain

$$2(x^2 - 6x + 9) + (y^2 - 4y + 4) = -18 + 18 + 4$$

or

$$2(x - 3)^2 + (y - 2)^2 = 4 \tag{3}$$

If we translate the coordinate axes by means of the translation equations

$$x' = x - 3, \qquad y' = y - 2$$

then (3) becomes

$$2x'^2 + y'^2 = 4 \quad\text{or}\quad \frac{x'^2}{2} + \frac{y'^2}{4} = 1$$

which is the equation of an ellipse in standard position in the x'y'-system. This ellipse is sketched in Figure 9.6.3.

Figure 9.6.3
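The completing-the-square step in Example 5 can be checked mechanically. The following is a minimal sketch, not part of the text: the function name `translate_conic` and its argument order are illustrative assumptions, and it handles only equations of the form $ax^2 + cy^2 + dx + ey + f = 0$ with no cross-product term and with a and c both nonzero.

```python
from fractions import Fraction

def translate_conic(a, c, d, e, f):
    """Complete the squares in a*x^2 + c*y^2 + d*x + e*y + f = 0.

    Hypothetical helper (not from the text). Returns (h, k, rhs) such that
    the equation is equivalent to a*(x - h)^2 + c*(y - k)^2 = rhs, so the
    translation equations are x' = x - h, y' = y - k.
    """
    a, c, d, e, f = (Fraction(v) for v in (a, c, d, e, f))
    if a == 0 or c == 0:
        raise ValueError("this sketch assumes both squared terms are present")
    h = -d / (2 * a)                 # from a*(x - h)^2 = a*x^2 - 2*a*h*x + a*h^2
    k = -e / (2 * c)
    rhs = a * h**2 + c * k**2 - f    # constants moved to the right-hand side
    return h, k, rhs

# Example 5: 2x^2 + y^2 - 12x - 4y + 18 = 0
h, k, rhs = translate_conic(2, 1, -12, -4, 18)
print(h, k, rhs)  # 3 2 4, i.e. 2(x - 3)^2 + (y - 2)^2 = 4
```

Exact rational arithmetic is used so that the result can be compared directly with the hand computation.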



Eliminating the Cross-Product Term 



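We shall now show how to identify conics that are rotated out of standard position. If we omit the brackets on 1 × 1 matrices,
then (2) can be written in the matrix form

$$\begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} a & b \\ b & c \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} d & e \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + f = 0$$

or

$$\mathbf{x}^T A \mathbf{x} + K\mathbf{x} + f = 0$$

where

$$\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}, \qquad A = \begin{bmatrix} a & b \\ b & c \end{bmatrix}, \qquad K = \begin{bmatrix} d & e \end{bmatrix}$$

Now consider a conic C whose equation in xy-coordinates is

$$\mathbf{x}^T A \mathbf{x} + K\mathbf{x} + f = 0 \tag{4}$$

We would like to rotate the xy-coordinate axes so that the equation of the conic in the new x'y'-coordinate system has no
cross-product term. This can be done as follows.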



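Step 1. Find a matrix

$$P = \begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{bmatrix}$$

that orthogonally diagonalizes the matrix A.

Step 2. Interchange the columns of P, if necessary, to make det(P) = 1. This ensures that the orthogonal coordinate
transformation

$$\mathbf{x} = P\mathbf{x}', \quad\text{that is,}\quad \begin{bmatrix} x \\ y \end{bmatrix} = P \begin{bmatrix} x' \\ y' \end{bmatrix} \tag{5}$$

is a rotation.

Step 3. To obtain the equation for C in the x'y'-system, substitute (5) into (4). This yields

$$(P\mathbf{x}')^T A (P\mathbf{x}') + K(P\mathbf{x}') + f = 0$$

or

$$(\mathbf{x}')^T (P^T A P)\mathbf{x}' + (KP)\mathbf{x}' + f = 0 \tag{6}$$

Since P orthogonally diagonalizes A,

$$P^T A P = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$$

where $\lambda_1$ and $\lambda_2$ are eigenvalues of A. Thus (6) can be rewritten as

$$\begin{bmatrix} x' & y' \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \begin{bmatrix} x' \\ y' \end{bmatrix} + \begin{bmatrix} d & e \end{bmatrix} \begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{bmatrix} \begin{bmatrix} x' \\ y' \end{bmatrix} + f = 0$$

or

$$\lambda_1 x'^2 + \lambda_2 y'^2 + d'x' + e'y' + f = 0$$

(where $d' = dp_{11} + ep_{21}$ and $e' = dp_{12} + ep_{22}$). This equation has no cross-product term.
The following theorem summarizes this discussion.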



THEOREM 9.6.2

Principal Axes Theorem for $R^2$

Let

$$ax^2 + 2bxy + cy^2 + dx + ey + f = 0$$

be the equation of a conic C, and let

$$\mathbf{x}^T A \mathbf{x} = ax^2 + 2bxy + cy^2$$

be the associated quadratic form. Then the coordinate axes can be rotated so that the equation
for C in the new x'y'-coordinate system has the form

$$\lambda_1 x'^2 + \lambda_2 y'^2 + d'x' + e'y' + f = 0$$

where $\lambda_1$ and $\lambda_2$ are the eigenvalues of A. The rotation can be accomplished by the substitution

$$\mathbf{x} = P\mathbf{x}'$$

where P orthogonally diagonalizes A and det(P) = 1.



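EXAMPLE 6 Eliminating the Cross-Product Term

Describe the conic C whose equation is $5x^2 - 4xy + 8y^2 - 36 = 0$.

Solution

The matrix form of this equation is

$$\mathbf{x}^T A \mathbf{x} - 36 = 0 \tag{7}$$

where

$$A = \begin{bmatrix} 5 & -2 \\ -2 & 8 \end{bmatrix}$$

The characteristic equation of A is

$$\det(\lambda I - A) = \det \begin{bmatrix} \lambda - 5 & 2 \\ 2 & \lambda - 8 \end{bmatrix} = (\lambda - 9)(\lambda - 4) = 0$$

so the eigenvalues of A are $\lambda = 4$ and $\lambda = 9$. We leave it for the reader to show that orthonormal bases for the eigenspaces are

$$\lambda = 4:\ \mathbf{v}_1 = \begin{bmatrix} 2/\sqrt{5} \\ 1/\sqrt{5} \end{bmatrix}, \qquad \lambda = 9:\ \mathbf{v}_2 = \begin{bmatrix} -1/\sqrt{5} \\ 2/\sqrt{5} \end{bmatrix}$$

Thus

$$P = \begin{bmatrix} 2/\sqrt{5} & -1/\sqrt{5} \\ 1/\sqrt{5} & 2/\sqrt{5} \end{bmatrix}$$

orthogonally diagonalizes A. Moreover, det(P) = 1, and thus the orthogonal coordinate transformation

$$\mathbf{x} = P\mathbf{x}' \tag{8}$$

is a rotation. Substituting (8) into (7) yields

$$(P\mathbf{x}')^T A (P\mathbf{x}') - 36 = 0 \quad\text{or}\quad (\mathbf{x}')^T (P^T A P)\mathbf{x}' - 36 = 0$$

Since

$$P^T A P = \begin{bmatrix} 4 & 0 \\ 0 & 9 \end{bmatrix}$$

this equation can be written as

$$\begin{bmatrix} x' & y' \end{bmatrix} \begin{bmatrix} 4 & 0 \\ 0 & 9 \end{bmatrix} \begin{bmatrix} x' \\ y' \end{bmatrix} - 36 = 0$$

or

$$4x'^2 + 9y'^2 - 36 = 0 \quad\text{or}\quad \frac{x'^2}{9} + \frac{y'^2}{4} = 1$$

which is the equation of the ellipse sketched in Figure 9.6.4. In that figure, the vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ are the column vectors of
P, that is, the eigenvectors of A.

Figure 9.6.4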
EXAMPLE 7 Eliminating the Cross-Product Term Plus Translation

Describe the conic C whose equation is

$$5x^2 - 4xy + 8y^2 + 4\sqrt{5}\,x - 16\sqrt{5}\,y + 4 = 0$$

Solution

The matrix form of this equation is

$$\mathbf{x}^T A \mathbf{x} + K\mathbf{x} + 4 = 0 \tag{9}$$

where

$$A = \begin{bmatrix} 5 & -2 \\ -2 & 8 \end{bmatrix} \quad\text{and}\quad K = \begin{bmatrix} 4\sqrt{5} & -16\sqrt{5} \end{bmatrix}$$

As shown in Example 6,

$$P = \begin{bmatrix} 2/\sqrt{5} & -1/\sqrt{5} \\ 1/\sqrt{5} & 2/\sqrt{5} \end{bmatrix}$$

orthogonally diagonalizes A and has determinant 1. Substituting $\mathbf{x} = P\mathbf{x}'$ into (9) gives

$$(P\mathbf{x}')^T A (P\mathbf{x}') + K(P\mathbf{x}') + 4 = 0$$

or

$$(\mathbf{x}')^T (P^T A P)\mathbf{x}' + (KP)\mathbf{x}' + 4 = 0 \tag{10}$$

Since

$$P^T A P = \begin{bmatrix} 4 & 0 \\ 0 & 9 \end{bmatrix} \quad\text{and}\quad KP = \begin{bmatrix} 4\sqrt{5} & -16\sqrt{5} \end{bmatrix} \begin{bmatrix} 2/\sqrt{5} & -1/\sqrt{5} \\ 1/\sqrt{5} & 2/\sqrt{5} \end{bmatrix} = \begin{bmatrix} -8 & -36 \end{bmatrix}$$

(10) can be written as

$$4x'^2 + 9y'^2 - 8x' - 36y' + 4 = 0 \tag{11}$$

To bring the conic into standard position, the x'y' axes must be translated. Proceeding as in Example 5, we rewrite (11) as

$$4(x'^2 - 2x') + 9(y'^2 - 4y') = -4$$

Completing the squares yields

$$4(x'^2 - 2x' + 1) + 9(y'^2 - 4y' + 4) = -4 + 4 + 36$$

or

$$4(x' - 1)^2 + 9(y' - 2)^2 = 36 \tag{12}$$

If we translate the coordinate axes by means of the translation equations

$$x'' = x' - 1, \qquad y'' = y' - 2$$

then (12) becomes

$$4x''^2 + 9y''^2 = 36 \quad\text{or}\quad \frac{x''^2}{9} + \frac{y''^2}{4} = 1$$

which is the equation of the ellipse sketched in Figure 9.6.5. In that figure, the vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ are the column vectors of P,
that is, the eigenvectors of A.

Figure 9.6.5
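The three-step rotation procedure used in Examples 6 and 7 can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function name `rotate_conic` is hypothetical, the eigenvalues of the 2 × 2 symmetric matrix are obtained from the quadratic formula rather than by a general eigenvalue routine, and instead of interchanging columns the second column of P is taken to be the first column rotated by 90 degrees, which also guarantees det(P) = 1.

```python
import math

def rotate_conic(a, b, c, d=0.0, e=0.0):
    """Rotation removing the xy-term of a*x^2 + 2*b*x*y + c*y^2 + d*x + e*y + f = 0.

    Hypothetical helper (not from the text). Returns (lam1, lam2, P, d1, e1):
    the rotated equation is lam1*x'^2 + lam2*y'^2 + d1*x' + e1*y' + f = 0.
    """
    if b == 0:
        lam1, lam2 = a, c                      # A is already diagonal
        vx, vy = 1.0, 0.0
    else:
        mean = (a + c) / 2
        radius = math.hypot((a - c) / 2, b)
        lam1, lam2 = mean - radius, mean + radius
        vx, vy = b, lam1 - a                   # solves (a - lam1)*vx + b*vy = 0
        norm = math.hypot(vx, vy)
        vx, vy = vx / norm, vy / norm
        if vx < 0:                             # sign convention only
            vx, vy = -vx, -vy
    # Second column is the first rotated by 90 degrees, so det(P) = 1.
    P = [[vx, -vy], [vy, vx]]
    d1 = d * P[0][0] + e * P[1][0]             # d' = d*p11 + e*p21
    e1 = d * P[0][1] + e * P[1][1]             # e' = d*p12 + e*p22
    return lam1, lam2, P, d1, e1

# Example 7: 5x^2 - 4xy + 8y^2 + 4*sqrt(5)x - 16*sqrt(5)y + 4 = 0  (a=5, b=-2, c=8)
lam1, lam2, P, d1, e1 = rotate_conic(5, -2, 8, 4 * math.sqrt(5), -16 * math.sqrt(5))
print(round(lam1, 9), round(lam2, 9), round(d1, 9), round(e1, 9))
```

For the data of Example 7 the rounded values agree with equation (11) up to floating-point error. In two dimensions the unit vector orthogonal to an eigenvector of a symmetric matrix is itself an eigenvector, which is why the second column needs no separate computation here.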
Exercise Set 9.6

1. In each part, find a change of variables that reduces the quadratic form to a sum or difference of squares, and express the
quadratic form in terms of the new variables.

(a) $2x_1^2 + 2x_2^2 - 2x_1x_2$

(b) $5x_1^2 + 2x_2^2 + 4x_1x_2$

(c) $2x_1x_2$

(d) $-3x_1^2 + 5x_2^2 + 2x_1x_2$

2. In each part, find a change of variables that reduces the quadratic form to a sum or difference of squares, and express the
quadratic form in terms of the new variables.

(a) $3x_1^2 + 4x_2^2 + 5x_3^2 + 4x_1x_2 - 4x_2x_3$

(b) $2x_1^2 + 5x_2^2 + 5x_3^2 + 4x_1x_2 - 4x_1x_3 - 8x_2x_3$

(c) $-5x_1^2 + x_2^2 - x_3^2 + 6x_1x_3 + 4x_1x_2$

(d) $2x_1x_3 + 6x_2x_3$

3. Find the quadratic forms associated with the following quadratic equations.

(a) $2x^2 - 3xy + 4y^2 - 7x + 2y + 7 = 0$

(b) $x^2 - xy + 5x + 8y - 3 = 0$

(c) $5xy = 8$

(d) $4x^2 - 2y^2 = 7$

(e) $y^2 + 7x - 8y - 5 = 0$

4. Find the matrices of the quadratic forms in Exercise 3.

5. Express each of the quadratic equations in Exercise 3 in the matrix form $\mathbf{x}^T A \mathbf{x} + K\mathbf{x} + f = 0$.

6. Name the following conics.

(a) $2x^2 + 5y^2 = 20$

(b) $4x^2 + 9y^2 = 1$

(c) $x^2 - y^2 - 8 = 0$

(d) $4y^2 - 5x^2 = 20$

(e) $x^2 + y^2 - 25 = 0$

(f) $7y^2 - 2x = 0$

(g) $-x^2 = 2y$

(h) $3x - 11y^2 = 0$

(i) $y - x^2 = 0$

(j) $x^2 - 3 = -y^2$

7. In each part, a translation will put the conic in standard position. Name the conic and give its equation in the translated
coordinate system.

(a) $9x^2 + 4y^2 - 36x - 24y + 36 = 0$

(b) $x^2 - 16y^2 + 8x + 128y = 256$

(c) $y^2 - 8x - 14y + 49 = 0$

(d) $x^2 + y^2 + 6x - 10y + 13 = 0$

(e) $2x^2 - 3y^2 + 6x + 20y = -41$

(f) $x^2 + 10x + 7y = -32$

8. The following nondegenerate conics are rotated out of standard position. In each part, rotate the coordinate axes to remove
the xy-term. Name the conic and give its equation in the rotated coordinate system.

(a) $2x^2 - 4xy - y^2 + 8 = 0$

(b) $5x^2 + 4xy + 5y^2 = 9$

(c) $11x^2 + 24xy + 4y^2 - 15 = 0$

In Exercises 9-14 translate and rotate the coordinate axes, if necessary, to put the conic in standard position. Name the conic
and give its equation in the final coordinate system.

9. $9x^2 - 4xy + 6y^2 - 10x - 20y = 5$

10. $3x^2 - 8xy - 12y^2 - 30x - 64y = 0$

11. $2x^2 - 4xy - y^2 - 4x - 8y = -14$

12. $21x^2 + 6xy + 13y^2 - 114x + 34y + 73 = 0$

13. $x^2 - 6xy - 7y^2 + 10x + 2y + 9 = 0$

14. $4x^2 - 20xy + 25y^2 - 15x - 6y = 0$

15. The graph of a quadratic equation in x and y can, in certain cases, be a point, a line, or a pair of lines. These are called
degenerate conics. It is also possible that the equation is not satisfied by any real values of x and y. In such cases the
equation has no graph; it is said to represent an imaginary conic. Each of the following represents a degenerate or
imaginary conic. Where possible, sketch the graph.

(a) $x^2 - y^2 = 0$

(b) $x^2 + 3y^2 + 7 = 0$

(c) $8x^2 + 7y^2 = 0$

(d) $x^2 - 2xy + y^2 = 0$

(e) $9x^2 + 12xy + 4y^2 - 52 = 0$

(f) $x^2 + y^2 - 2x - 4y = -5$

16. Prove: If $b \neq 0$, then the cross-product term can be eliminated from the quadratic form $ax^2 + 2bxy + cy^2$ by rotating the
coordinate axes through an angle $\theta$ that satisfies the equation

$$\cot 2\theta = \frac{a - c}{2b}$$
9.7 QUADRIC SURFACES

In this section we shall apply the diagonalization techniques developed in
the preceding section to quadratic equations in three variables, and we
shall use our results to study quadric surfaces.

In Section 9.6 we looked at quadratic equations in two variables.



Quadric Surfaces 

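An equation of the form

$$ax^2 + by^2 + cz^2 + 2dxy + 2exz + 2fyz + gx + hy + iz + j = 0 \tag{1}$$

where a, b, ..., f are not all zero, is called a quadratic equation in x, y, and z; the expression

$$ax^2 + by^2 + cz^2 + 2dxy + 2exz + 2fyz$$

is called the associated quadratic form, which now involves three variables: x, y, and z.

Equation (1) can be written in the matrix form

$$\begin{bmatrix} x & y & z \end{bmatrix} \begin{bmatrix} a & d & e \\ d & b & f \\ e & f & c \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} + \begin{bmatrix} g & h & i \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} + j = 0$$

or

$$\mathbf{x}^T A \mathbf{x} + K\mathbf{x} + j = 0$$

where

$$\mathbf{x} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \qquad A = \begin{bmatrix} a & d & e \\ d & b & f \\ e & f & c \end{bmatrix}, \qquad K = \begin{bmatrix} g & h & i \end{bmatrix}$$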



EXAMPLE 1 Associated Quadratic Form 



The quadratic form associated with the quadratic equation

$$3x^2 + 2y^2 - z^2 + 4xy + 3xz - 8yz + 7x + 2y + 3z - 1 = 0$$

is

$$3x^2 + 2y^2 - z^2 + 4xy + 3xz - 8yz$$



Graphs of quadratic equations in x, y, and z are called quadrics or quadric surfaces. The simplest equations for quadric
surfaces occur when those surfaces are placed in certain standard positions relative to the coordinate axes. Figure 9.7.1
shows the six basic quadric surfaces and the equations for those surfaces when the surfaces are in the standard positions
shown in the figure. If a quadric surface is cut by a plane, then the curve of intersection is called the trace of the plane on the
surface. To help visualize the quadric surfaces in Figure 9.7.1, we have shown and described the traces made by planes
parallel to the coordinate planes. The presence of one or more of the cross-product terms xy, xz, and yz in the equation of a
quadric indicates that the quadric is rotated out of standard position; the presence of both $x^2$ and x terms, $y^2$ and y terms, or
$z^2$ and z terms in a quadric with no cross-product term indicates the quadric is translated out of standard position.



[Figure 9.7.1: the six basic quadric surfaces in standard position (drawings omitted). For each surface the equation and a description of its traces are given.]

Ellipsoid
Equation: $\dfrac{x^2}{l^2} + \dfrac{y^2}{m^2} + \dfrac{z^2}{n^2} = 1$
The traces in the coordinate planes are ellipses, as are the traces in those planes that are parallel to the coordinate planes
and intersect the surface in more than one point.

Hyperboloid of one sheet
Equation: $\dfrac{x^2}{l^2} + \dfrac{y^2}{m^2} - \dfrac{z^2}{n^2} = 1$
The trace in the xy-plane is an ellipse, as are the traces in planes parallel to the xy-plane. The traces in the yz-plane and
the xz-plane are hyperbolas, as are the traces in those planes that are parallel to these and do not pass through the x- or
y-intercepts. At these intercepts the traces are pairs of intersecting lines.

Hyperboloid of two sheets
Equation: $\dfrac{z^2}{n^2} - \dfrac{x^2}{l^2} - \dfrac{y^2}{m^2} = 1$
There is no trace in the xy-plane. In planes parallel to the xy-plane that intersect the surface in more than one point, the
traces are ellipses. In the yz- and xz-planes, the traces are hyperbolas, as are the traces in those planes that are parallel to
these and intersect the surface in more than one point.

Elliptic cone
Equation: $z^2 = \dfrac{x^2}{l^2} + \dfrac{y^2}{m^2}$
The trace in the xy-plane is a point (the origin), and the traces in planes parallel to the xy-plane are ellipses. The traces in
the yz- and xz-planes are pairs of lines intersecting at the origin. The traces in planes parallel to these are hyperbolas.

Elliptic paraboloid
Equation: $z = \dfrac{x^2}{l^2} + \dfrac{y^2}{m^2}$
The trace in the xy-plane is a point (the origin), and the traces in planes parallel to and above the xy-plane are ellipses.
The traces in the yz- and xz-planes are parabolas, as are the traces in planes parallel to these.

Hyperbolic paraboloid
Equation: $z = \dfrac{y^2}{m^2} - \dfrac{x^2}{l^2}$
The trace in the xy-plane is a pair of lines intersecting at the origin. The traces in planes parallel to the xy-plane are
hyperbolas. The hyperbolas above the xy-plane open in the y-direction, and those below in the x-direction. The traces in
the yz- and xz-planes are parabolas, as are the traces in planes parallel to these.

Figure 9.7.1



EXAMPLE 2 Identifying a Quadric Surface 



Describe the quadric surface whose equation is

$$4x^2 + 36y^2 - 9z^2 - 16x - 216y + 304 = 0$$

Solution

Rearranging terms gives

$$4(x^2 - 4x) + 36(y^2 - 6y) - 9z^2 = -304$$

Completing the squares yields

$$4(x^2 - 4x + 4) + 36(y^2 - 6y + 9) - 9z^2 = -304 + 16 + 324$$

or

$$4(x - 2)^2 + 36(y - 3)^2 - 9z^2 = 36$$

or

$$\frac{(x - 2)^2}{9} + (y - 3)^2 - \frac{z^2}{4} = 1$$

Translating the axes by means of the translation equations

$$x' = x - 2, \qquad y' = y - 3, \qquad z' = z$$

yields

$$\frac{x'^2}{9} + y'^2 - \frac{z'^2}{4} = 1$$

which is the equation of a hyperboloid of one sheet.



Eliminating Cross- Product Terms 



The procedure for identifying quadrics that are rotated out of standard position is similar to the procedure for conics. Let Q
be a quadric surface whose equation in xyz-coordinates is

$$\mathbf{x}^T A \mathbf{x} + K\mathbf{x} + j = 0 \tag{2}$$

We want to rotate the xyz-coordinate axes so that the equation of the quadric in the new x'y'z'-coordinate system has no
cross-product terms. This can be done as follows:

Step 1. Find a matrix P that orthogonally diagonalizes A, the matrix of the quadratic form $\mathbf{x}^T A \mathbf{x}$.

Step 2. Interchange two columns of P, if necessary, to make det(P) = 1. This ensures that the orthogonal coordinate
transformation

$$\mathbf{x} = P\mathbf{x}', \quad\text{that is,}\quad \begin{bmatrix} x \\ y \\ z \end{bmatrix} = P \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} \tag{3}$$

is a rotation.

Step 3. Substitute (3) into (2). This will produce an equation for the quadric in x'y'z'-coordinates with no cross-product
terms. (The proof is similar to that for conics and is left as an exercise.)



The following theorem summarizes this discussion. 



THEOREM 9.7.1

Principal Axes Theorem for $R^3$

Let

$$ax^2 + by^2 + cz^2 + 2dxy + 2exz + 2fyz + gx + hy + iz + j = 0$$

be the equation of a quadric Q, and let

$$\mathbf{x}^T A \mathbf{x} = ax^2 + by^2 + cz^2 + 2dxy + 2exz + 2fyz$$

be the associated quadratic form. The coordinate axes can be rotated so that the equation of Q
in the x'y'z'-coordinate system has the form

$$\lambda_1 x'^2 + \lambda_2 y'^2 + \lambda_3 z'^2 + g'x' + h'y' + i'z' + j = 0$$

where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the eigenvalues of A. The rotation can be accomplished by the
substitution

$$\mathbf{x} = P\mathbf{x}'$$

where P orthogonally diagonalizes A and det(P) = 1.



EXAMPLE 3 Eliminating Cross-Product Terms 



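Describe the quadric surface whose equation is

$$4x^2 + 4y^2 + 4z^2 + 4xy + 4xz + 4yz - 3 = 0$$

Solution

The matrix form of the above quadratic equation is

$$\mathbf{x}^T A \mathbf{x} - 3 = 0 \tag{4}$$

where

$$A = \begin{bmatrix} 4 & 2 & 2 \\ 2 & 4 & 2 \\ 2 & 2 & 4 \end{bmatrix}$$

As shown in Example 1 of Section 7.3, the eigenvalues of A are $\lambda = 2$ and $\lambda = 8$, and A is orthogonally diagonalized by the
matrix

$$P = \begin{bmatrix} -1/\sqrt{2} & -1/\sqrt{6} & 1/\sqrt{3} \\ 1/\sqrt{2} & -1/\sqrt{6} & 1/\sqrt{3} \\ 0 & 2/\sqrt{6} & 1/\sqrt{3} \end{bmatrix}$$

where the first two column vectors in P are eigenvectors corresponding to $\lambda = 2$, and the third column vector is an
eigenvector corresponding to $\lambda = 8$.

Since det(P) = 1 (verify), the orthogonal coordinate transformation $\mathbf{x} = P\mathbf{x}'$ is a rotation. Substituting this expression in (4)
yields

$$(P\mathbf{x}')^T A (P\mathbf{x}') - 3 = 0$$

or, equivalently,

$$(\mathbf{x}')^T (P^T A P)\mathbf{x}' - 3 = 0 \tag{5}$$

But

$$P^T A P = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 8 \end{bmatrix}$$

so (5) becomes

$$\begin{bmatrix} x' & y' & z' \end{bmatrix} \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 8 \end{bmatrix} \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} - 3 = 0$$

or

$$2x'^2 + 2y'^2 + 8z'^2 = 3$$

This can be rewritten as

$$\frac{x'^2}{3/2} + \frac{y'^2}{3/2} + \frac{z'^2}{3/8} = 1$$

which is the equation of an ellipsoid.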
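The two facts that Example 3 asks the reader to accept or verify, namely that $P^T A P$ is diagonal with entries 2, 2, 8 and that det(P) = 1, can be checked numerically. The sketch below uses only the Python standard library; the helper names `matmul`, `transpose`, and `det3` are illustrative and not from the text, and the comparison is approximate because the entries of P are floating-point numbers.

```python
import math

A = [[4, 2, 2],
     [2, 4, 2],
     [2, 2, 4]]

s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
P = [[-1 / s2, -1 / s6, 1 / s3],
     [ 1 / s2, -1 / s6, 1 / s3],
     [ 0.0,     2 / s6, 1 / s3]]

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along row 1."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

D = matmul(matmul(transpose(P), A), P)   # should be close to diag(2, 2, 8)
for row in D:
    print([round(v, 9) + 0.0 for v in row])
print(round(det3(P), 9))                 # should be close to 1
```

If det(P) had come out as -1, interchanging two columns of P (Step 2) would change its sign without affecting orthogonality.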



Exercise Set 9.7

1. Find the quadratic forms associated with the following quadratic equations.

(a) $x^2 + 2y^2 - z^2 + 4xy - 5yz + 7x + 2z = 3$

(b) $3x^2 + 7z^2 + 2xy - 3xz + 4yz - 3x = 4$

(c) $xy + xz + yz = 1$

(d) $x^2 + y^2 - z^2 = 7$

(e) $3z^2 + 3xz - 14y + 9 = 0$

(f) $2z^2 + 2xz + y^2 + 2x - y + 3z = 0$

2. Find the matrices of the quadratic forms in Exercise 1.

3. Express each of the quadratic equations given in Exercise 1 in the matrix form $\mathbf{x}^T A \mathbf{x} + K\mathbf{x} + j = 0$.

4. Name the following quadrics.

(a) $36x^2 + 9y^2 + 4z^2 - 36 = 0$

(b) $2x^2 + 6y^2 - 3z^2 = 18$

(c) $6x^2 - 3y^2 - 2z^2 - 6 = 0$

(d) $9x^2 + 4y^2 - z^2 = 0$

(e) $16x^2 + y^2 = 16z$

(f) $7x^2 - 3y^2 + z = 0$

(g) $x^2 + y^2 + z^2 = 25$

5. In Exercise 4, identify the trace in the plane z = 1 in each case.

6. Find the matrices of the quadratic forms in Exercise 4. Express each of the quadratic equations in the matrix form
$\mathbf{x}^T A \mathbf{x} + K\mathbf{x} + j = 0$.

7. In each part, determine the translation equations that will put the quadric in standard position, and find the equation of the
quadric in the translated coordinate system. Name the quadric.

(a) $9x^2 + 36y^2 + 4z^2 - 18x - 144y - 24z + 153 = 0$

(b) $6x^2 + 3y^2 - 2z^2 + 12x - 13y - 8z = -7$

(c) $3x^2 - 3y^2 - z^2 + 42x + 144 = 0$

(d) $4x^2 + 9y^2 - z^2 - 54y - 50z = 544$

(e) $x^2 + 16y^2 + 2x - 32y - 16z - 15 = 0$

(f) $7x^2 - 3y^2 + 126x + 72y + z + 135 = 0$

(g) $x^2 + y^2 + z^2 - 2x + 4y - 6z = 11$

8. In each part, find a rotation $\mathbf{x} = P\mathbf{x}'$ that removes the cross-product terms, and give its equation in the x'y'z'-system.
Name the quadric.

(a) $2x^2 + 3y^2 + 23z^2 + 72xz + 150 = 0$

(b) $4x^2 + 4y^2 + 4z^2 + 4xy + 4xz + 4yz - 5 = 0$

(c) $144x^2 + 100y^2 + 81z^2 - 216xz - 540x - 720z = 0$

(d) $2xy + z = 0$

In Exercises 9-12 translate and rotate the coordinate axes to put the quadric in standard position. Name the quadric and give
its equation in the final coordinate system.

9. $2xy + 2xz + 2yz - 6x - 6y - 4z = -9$

10. $7x^2 + 7y^2 + 10z^2 - 2xy - 4xz + 4yz - 12x + 12y + 60z = 24$

11. $2xy - 6x + 10y + z - 31 = 0$

12. $2x^2 + 2y^2 + 5z^2 - 4xy - 2xz + 2yz + 10x - 26y - 2z = 0$

13. Prove Theorem 9.7.1.
9.8 COMPARISON OF PROCEDURES FOR SOLVING LINEAR SYSTEMS

In this section we shall discuss some practical aspects of solving systems of
linear equations, inverting matrices, and finding eigenvalues. Although we have
previously discussed methods for performing these computations, we now
consider their suitability for the computer solution of the large-scale problems
that arise in real-world applications.



Counting Operations 

Since computers are limited in the number of decimal places they can carry, they round off or truncate most numerical quantities.
For example, a computer designed to store eight decimal places might record 2/3 as either .66666667 (rounded off) or .66666666
(truncated). In either case, an error is introduced that we shall call roundoff error or rounding error.
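The eight-decimal-place example can be reproduced with a short sketch. It uses Python's standard `decimal` module to imitate a machine that stores eight decimal places; this is an illustrative model of the idea, not a description of how any particular computer stores numbers.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP, getcontext

getcontext().prec = 30                      # work with 30 significant digits
two_thirds = Decimal(2) / Decimal(3)        # 0.666666... to 30 digits
eight_places = Decimal("0.00000001")        # quantum for eight decimal places

rounded = two_thirds.quantize(eight_places, rounding=ROUND_HALF_UP)
truncated = two_thirds.quantize(eight_places, rounding=ROUND_DOWN)

print(rounded, truncated)                   # 0.66666667 0.66666666
# Size of the roundoff error in each case (relative to the 30-digit value)
print(abs(two_thirds - rounded), abs(two_thirds - truncated))
```

In this instance rounding gives the smaller error, roughly 3.3 × 10⁻⁹ against 6.7 × 10⁻⁹ for truncation.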

The main practical considerations in solving linear algebra problems on digital computers are minimizing the computer time (and 
thus cost) needed to obtain the solution, and minimizing inaccuracies due to roundoff errors. Thus, a good computer algorithm 
uses as few operations and memory accesses as possible, and performs the operations in a way that minimizes the effect of 
roundoff errors. 

In this text we have studied four methods for solving a linear system, Ax = b, of n equations in n unknowns:

1. Gaussian elimination with back-substitution

2. Gauss-Jordan elimination

3. Computing $A^{-1}$, then forming $\mathbf{x} = A^{-1}\mathbf{b}$

4. Cramer's rule



To determine how these methods compare as computational tools, we need to know how many arithmetic operations each requires. 
It is usual to group divisions and multiplications together and to group additions and subtractions together. Divisions and 
multiplications are considerably slower than additions and subtractions, in general. We shall refer to either multiplications or 
divisions as "multiplications" and to additions or subtractions as "additions." 

In Table 1 we list the number of operations required to solve a linear system Ax = b of n equations in n unknowns by each of the 
four methods discussed in the text, as well as the number of operations required to invert A or to compute its determinant by row 
reduction. 

Table 1

Operation Counts for an Invertible n × n Matrix A

Method                                              Number of Additions                                  Number of Multiplications

Solve Ax = b by Gauss-Jordan elimination            $\frac{1}{3}n^3 + \frac{1}{2}n^2 - \frac{5}{6}n$                      $\frac{1}{3}n^3 + n^2 - \frac{1}{3}n$

Solve Ax = b by Gaussian elimination                $\frac{1}{3}n^3 + \frac{1}{2}n^2 - \frac{5}{6}n$                      $\frac{1}{3}n^3 + n^2 - \frac{1}{3}n$

Find $A^{-1}$ by reducing [A | I] to [I | $A^{-1}$]           $n^3 - 2n^2 + n$                                       $n^3$

Solve Ax = b as $\mathbf{x} = A^{-1}\mathbf{b}$                       $n^3 - n^2$                                            $n^3 + n^2$

Find det(A) by row reduction                        $\frac{1}{3}n^3 - \frac{1}{2}n^2 + \frac{1}{6}n$                      $\frac{1}{3}n^3 + \frac{2}{3}n - 1$

Solve Ax = b by Cramer's rule                       $\frac{1}{3}n^4 - \frac{1}{6}n^3 - \frac{1}{3}n^2 + \frac{1}{6}n$            $\frac{1}{3}n^4 + \frac{1}{3}n^3 + \frac{2}{3}n^2 + \frac{2}{3}n - 1$



Note that the text methods of Gauss-Jordan elimination and Gaussian elimination have the same operation counts. It is not hard to 
see why this is so. Both methods begin by reducing the augmented matrix to row-echelon form. This is called the forward phase or 
forward pass. Then the solution is completed by back-substitution in Gaussian elimination and by continued reduction to reduced 
row-echelon form in Gauss-Jordan elimination. This is called the backward phase or backward pass . It turns out that the number 
of operations required for the backward phase is the same whether one uses back-substitution or continued reduction to reduced 
row-echelon form. Thus the text method of Gaussian elimination and the text method of Gauss-Jordan elimination have the same 
operation counts. 

Remark There is a common variation of Gauss-Jordan elimination that is less efficient than the one presented in this text. In our
method the augmented matrix is first reduced to row-echelon form by introducing zeros below the leading 1's; then the
reduction is completed by introducing zeros above the leading 1's. An alternative procedure is to introduce zeros above and below
a leading 1 as soon as it is obtained. This method requires

$$\frac{n^3}{2} - \frac{n}{2} \text{ additions} \quad\text{and}\quad \frac{n^3}{2} + \frac{n^2}{2} \text{ multiplications}$$

both of which are larger than our values for all $n \geq 3$.



To illustrate how the results in Table 1 are computed, we shall derive the operation counts for Gauss-Jordan elimination. For this
discussion we need the following formulas for the sum of the first n positive integers and the sum of the squares of the first n
positive integers:

$$1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2} \tag{1}$$

$$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6} \tag{2}$$

Derivations of these formulas are discussed in the exercises. We also need formulas for the sum of the first n - 1 positive integers
and the sum of the squares of the first n - 1 positive integers. These can be obtained by substituting n - 1 for n in (1) and (2).

$$1 + 2 + 3 + \cdots + (n - 1) = \frac{(n - 1)n}{2} \tag{3}$$

$$1^2 + 2^2 + 3^2 + \cdots + (n - 1)^2 = \frac{(n - 1)n(2n - 1)}{6} \tag{4}$$



Operation Count for Gauss-Jordan Elimination 

Let Ax = b be a system of n linear equations in n unknowns, and assume that A is invertible, so that the system has a unique
solution. Also assume, for simplicity, that no row interchanges are required to put the augmented matrix [A | b] in reduced
row-echelon form. This assumption is justified by the fact that row interchanges are performed as bookkeeping operations on a
computer (that is, they are simulated, not actually performed) and so require much less time than arithmetic operations.



Since no row interchanges are required, the first step in the Gauss-Jordan elimination process is to introduce a leading 1 in the first 
row by multiplying the elements in that row by the reciprocal of the leftmost entry in the row. We shall represent this step 
schematically as follows: 



[Schematic omitted: the augmented matrix with a leading 1 introduced in the first row. In the schematic, × denotes a quantity that is not being computed.]



Note that the leading 1 is simply recorded and requires no computation; only the remaining n entries in the first row must be 
computed. 

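The following is a schematic description of the steps and the number of operations required to reduce [A | b] to row-echelon form.
(The matrix schematics are omitted; only the operation counts for each step are listed.)

Step 1. n multiplications, 0 additions

Step 1a. n multiplications/row, n additions/row, n - 1 rows requiring computations

Step 2. n - 1 multiplications, 0 additions

Step 2a. n - 1 multiplications/row, n - 1 additions/row, n - 2 rows requiring computations

Step 3. n - 2 multiplications, 0 additions

Step 3a. n - 2 multiplications/row, n - 2 additions/row, n - 3 rows requiring computations

...

Step (n - 1). 2 multiplications, 0 additions

Step (n - 1)a. 2 multiplications/row, 2 additions/row, 1 row requiring computations

Step n. 1 multiplication, 0 additions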
Thus, the number of operations required to complete successive steps is as follows:

Steps 1 and 1a.
Multiplications: $n + n(n - 1) = n^2$
Additions: $n(n - 1) = n^2 - n$

Steps 2 and 2a.
Multiplications: $(n - 1) + (n - 1)(n - 2) = (n - 1)^2$
Additions: $(n - 1)(n - 2) = (n - 1)^2 - (n - 1)$

Steps 3 and 3a.
Multiplications: $(n - 2) + (n - 2)(n - 3) = (n - 2)^2$
Additions: $(n - 2)(n - 3) = (n - 2)^2 - (n - 2)$

...

Steps (n - 1) and (n - 1)a.
Multiplications: $4\ (= 2^2)$
Additions: $2\ (= 2^2 - 2)$

Step n.
Multiplications: $1\ (= 1^2)$
Additions: $0\ (= 1^2 - 1)$

Therefore, the total number of operations required to reduce [A | b] to row-echelon form is

Multiplications: $n^2 + (n - 1)^2 + (n - 2)^2 + \cdots + 1^2$

Additions: $[n^2 + (n - 1)^2 + (n - 2)^2 + \cdots + 1^2] - [n + (n - 1) + (n - 2) + \cdots + 1]$

or, on applying Formulas (1) and (2),

$$\text{Multiplications: } \frac{n(n + 1)(2n + 1)}{6} = \frac{n^3}{3} + \frac{n^2}{2} + \frac{n}{6} \tag{5}$$

$$\text{Additions: } \frac{n(n + 1)(2n + 1)}{6} - \frac{n(n + 1)}{2} = \frac{n^3}{3} - \frac{n}{3} \tag{6}$$



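This completes the operation count for the forward phase. For the backward phase we must put the row-echelon form of [A | b]
into reduced row-echelon form by introducing zeros above the leading 1's. The operations are as follows (matrix schematics
omitted; in each step only the entries in the last column of the rows above the leading 1 need to be computed):

Step 1. n - 1 multiplications, n - 1 additions

Step 2. n - 2 multiplications, n - 2 additions

...

Step (n - 2). 2 multiplications, 2 additions

Step (n - 1). 1 multiplication, 1 addition

Thus, the number of operations required for the backward phase is

Multiplications: $(n - 1) + (n - 2) + \cdots + 2 + 1$

Additions: $(n - 1) + (n - 2) + \cdots + 2 + 1$

or, on applying Formula (3),

$$\text{Multiplications: } \frac{(n - 1)n}{2} = \frac{n^2}{2} - \frac{n}{2} \tag{7}$$

$$\text{Additions: } \frac{(n - 1)n}{2} = \frac{n^2}{2} - \frac{n}{2} \tag{8}$$

Thus, from (5), (6), (7), and (8), the total operation count for Gauss-Jordan elimination is

$$\text{Multiplications: } \left(\frac{n^3}{3} + \frac{n^2}{2} + \frac{n}{6}\right) + \left(\frac{n^2}{2} - \frac{n}{2}\right) = \frac{n^3}{3} + n^2 - \frac{n}{3} \tag{9}$$

$$\text{Additions: } \left(\frac{n^3}{3} - \frac{n}{3}\right) + \left(\frac{n^2}{2} - \frac{n}{2}\right) = \frac{n^3}{3} + \frac{n^2}{2} - \frac{5n}{6} \tag{10}$$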
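The totals for Gauss-Jordan elimination can be cross-checked by instrumenting the algorithm itself. The following is a minimal sketch under the same assumptions as the derivation: no row interchanges are needed, leading 1's and introduced zeros are simply recorded rather than computed, and divisions are counted as multiplications. The function names `gauss_jordan_counts` and `table1_counts` are illustrative and not from the text.

```python
from fractions import Fraction

def gauss_jordan_counts(A, b):
    """Solve Ax = b by the text's Gauss-Jordan method, counting operations.

    Returns (x, multiplications, additions). Exact rational arithmetic is
    used so that roundoff does not obscure the bookkeeping.
    """
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    mults = adds = 0
    # Forward phase: reduce [A | b] to row-echelon form.
    for k in range(n):
        pivot = M[k][k]
        if pivot == 0:
            raise ValueError("zero pivot: this sketch assumes no row interchanges")
        M[k][k] = Fraction(1)                  # leading 1 is recorded, not computed
        for j in range(k + 1, n + 1):
            M[k][j] /= pivot                   # counted as a multiplication
            mults += 1
        for i in range(k + 1, n):
            factor = M[i][k]
            M[i][k] = Fraction(0)              # zero is recorded, not computed
            for j in range(k + 1, n + 1):
                M[i][j] -= factor * M[k][j]
                mults += 1
                adds += 1
    # Backward phase: zeros above the leading 1's; only the last column changes.
    for k in range(n - 1, 0, -1):
        for i in range(k):
            factor = M[i][k]
            M[i][k] = Fraction(0)
            M[i][n] -= factor * M[k][n]
            mults += 1
            adds += 1
    return [row[n] for row in M], mults, adds

def table1_counts(n):
    """Totals (multiplications, additions) from the closed-form expressions."""
    n = Fraction(n)
    return n**3 / 3 + n**2 - n / 3, n**3 / 3 + n**2 / 2 - 5 * n / 6

x, mults, adds = gauss_jordan_counts([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]], [8, -11, -3])
print(x, mults, adds, table1_counts(3))
```

For this 3 × 3 system the counter reports 17 multiplications and 11 additions, which matches the closed-form totals with n = 3. Operations are counted even when a multiplier happens to be zero, in keeping with the worst-case convention of the derivation.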

Comparison of Methods for Solving Linear Systems 

In practical applications it is common to encounter linear systems with thousands of equations in thousands of unknowns. Thus we
shall be interested in Table 1 for large values of n. It is a fact about polynomials that for large values of the variable, a polynomial
can be approximated well by its term of highest degree; that is, if $a_k \neq 0$, then

$$a_0 + a_1x + \cdots + a_kx^k \approx a_kx^k \quad\text{for large } x$$

(Exercise 12). Thus, for large values of n, the operation counts in Table 1 can be approximated as shown in Table 2.

Table 2

Approximate Operation Counts for an Invertible n × n Matrix A for Large n

Method                                              Number of Additions      Number of Multiplications

Solve Ax = b by Gauss-Jordan elimination            $\approx n^3/3$                  $\approx n^3/3$

Solve Ax = b by Gaussian elimination                $\approx n^3/3$                  $\approx n^3/3$

Find $A^{-1}$ by reducing [A | I] to [I | $A^{-1}$]           $\approx n^3$                    $\approx n^3$

Solve Ax = b as $\mathbf{x} = A^{-1}\mathbf{b}$                       $\approx n^3$                    $\approx n^3$

Find det(A) by row reduction                        $\approx n^3/3$                  $\approx n^3/3$

Solve Ax = b by Cramer's rule                       $\approx n^4/3$                  $\approx n^4/3$

It follows from Table 2 that for large n, the best of these methods for solving Ax = b are Gaussian elimination and Gauss-Jordan
elimination. The method of multiplying by $A^{-1}$ is much worse than these (it requires three times as many operations), and the
poorest of the four methods is Cramer's rule.
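How quickly the leading-term approximations take hold can be seen by evaluating the exact multiplication counts for a few values of n. The sketch below hard-codes three of the multiplication formulas from Table 1; the function names are illustrative and not from the text.

```python
from fractions import Fraction

def elimination_mults(n):
    """Multiplications for Gaussian or Gauss-Jordan elimination (Table 1)."""
    n = Fraction(n)
    return n**3 / 3 + n**2 - n / 3

def inverse_mults(n):
    """Multiplications for solving Ax = b as x = A^(-1) b (Table 1)."""
    n = Fraction(n)
    return n**3 + n**2

def cramer_mults(n):
    """Multiplications for Cramer's rule (Table 1)."""
    n = Fraction(n)
    return n**4 / 3 + n**3 / 3 + 2 * n**2 / 3 + 2 * n / 3 - 1

for n in (10, 100, 1000):
    exact = elimination_mults(n)
    leading = Fraction(n)**3 / 3
    print(n,
          float(exact / leading),            # tends to 1: the n^3/3 term dominates
          float(inverse_mults(n) / exact),   # tends to 3
          float(cramer_mults(n) / exact))    # grows roughly like n
```

For n = 1000 the first ratio is within about 0.3% of 1 and the second is close to 3, while the Cramer's rule ratio is close to 1000, in line with the ranking stated above.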

Remark

We observed in the remark following Table 1 that if Gauss-Jordan elimination is performed by introducing zeros above and
below leading 1's as soon as they are obtained, then the operation count is

$$\frac{n^3}{2} - \frac{n}{2} \text{ additions} \quad\text{and}\quad \frac{n^3}{2} + \frac{n^2}{2} \text{ multiplications}$$

Thus, for large n, this procedure requires $\approx n^3/2$ multiplications, which is 50% greater than the $\approx n^3/3$ multiplications required by
the text method. Similarly for additions.

It is reasonable to ask if it is possible to devise other methods for solving linear systems that might require significantly fewer than
the $\approx n^3/3$ additions and multiplications needed in Gaussian elimination and Gauss-Jordan elimination. The answer is a qualified
"yes." In recent years, methods have been devised that require as few as $Cn^q$ multiplications, where q is slightly larger than 2.3. However,
these methods have little practical value because the programming is complicated, the constant C is very large, and the number of
additions required is excessive. In short, there is currently no practical method for the direct solution of general linear systems that
significantly improves on the operation counts for Gaussian elimination and the text method of Gauss-Jordan elimination.

Operation counts are not the only criterion by which to judge a method for the computer solution of a linear system. As the speed
of computers has increased, the time it takes to move entries of the matrix from memory to the processing unit has become
increasingly important. For very large matrices, the time for memory accesses greatly exceeds the time required to do the actual
computations! Despite this, the conclusion above still stands: Except for extremely large matrices, Gaussian elimination or a
variant thereof is nearly always the method of choice for solving Ax = b. It is almost never necessary to compute $A^{-1}$, and we
should avoid doing so whenever possible. Solving Ax = b by Cramer's rule would be senseless for numerical purposes, despite its
theoretical value.



EXAMPLE 1 Avoiding the Inverse 



Suppose we needed to compute the product $AB^{-1}Cx$. The result is a vector y. Rather than computing $y = A(B^{-1}(Cx))$ as given, it 
would be more efficient to write this as 

\[ z = B^{-1}Cx, \qquad y = Az \]

that is, as 

\[ Bz = Cx, \qquad y = Az \]

and to compute the result as follows: First, compute the vector w = Cx; second, solve Bz = w for z using Gaussian elimination; 
third, compute the vector y = Az. 

♦ 
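The three steps of Example 1 can be sketched in Python as follows. This is only an illustrative sketch: the function names (`mat_vec`, `solve`, `a_binv_c_x`) and the small matrices in the usage lines are assumptions made for the illustration, not part of the text, and exact rational arithmetic is used so that roundoff plays no role.

```python
from fractions import Fraction


def mat_vec(M, v):
    """Return the matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def solve(B, w):
    """Solve B z = w by Gaussian elimination followed by back-substitution.

    Assumes B is square and invertible; rows are swapped only when a
    zero pivot is encountered.
    """
    n = len(B)
    # Augmented matrix [B | w] with exact rational entries.
    aug = [[Fraction(e) for e in row] + [Fraction(w[i])]
           for i, row in enumerate(B)]
    for col in range(n):
        # Locate a row at or below the diagonal with a nonzero entry.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    # Back-substitution, from the last equation up to the first.
    z = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - known) / aug[i][i]
    return z


def a_binv_c_x(A, B, C, x):
    """Compute A B^{-1} C x without ever forming B^{-1}."""
    w = mat_vec(C, x)      # first:  w = Cx
    z = solve(B, w)        # second: solve Bz = w
    return mat_vec(A, z)   # third:  y = Az


# Hypothetical data chosen only to exercise the three steps.
A = [[1, 2], [3, 4]]
B = [[2, 1], [1, 1]]
C = [[1, 0], [0, 1]]
x = [3, 2]
y = a_binv_c_x(A, B, C, x)   # y equals [3, 7]
```

Every operation here is either a matrix-vector product or a single elimination, so no matrix-matrix product and no inverse is ever computed.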

For extremely large matrices, such as the ones that occur in numerical weather prediction, approximate methods for solving Ax = b 
are often employed. In such cases, the matrix is typically sparse; that is, it has very few nonzero entries. These techniques are 
beyond the scope of this text. 



Exercise Set 9.8 






1. Find the number of additions and multiplications required to compute AB if A is an m × n matrix and B is an n × p matrix. 

2. Use the result in Exercise 1 to find the number of additions and multiplications required to compute $A^k$ by direct multiplication 
if A is an n × n matrix. 



3. Assuming A to be an n × n matrix, use the formulas in Table 1 to determine the number of operations required for the 
procedures in Table 3. 

Table 3 

                                             n = 5      n = 10     n = 100    n = 1000 
                                             +    ×     +    ×     +    ×     +    × 
Solve Ax = b by Gauss-Jordan elimination 
Solve Ax = b by Gaussian elimination 
Find A^{-1} by reducing [A | I] to [I | A^{-1}] 
Solve Ax = b as x = A^{-1}b 
Find det(A) by row reduction 
Solve Ax = b by Cramer's rule 



4. Assuming for simplicity a computer execution time of 2.0 microseconds for multiplications and 0.5 microsecond for additions, 
use the results in Exercise 3 to fill in the execution times in seconds for the procedures in Table 4. 

Table 4 

                                             n = 5            n = 10           n = 100          n = 1000 
                                             Execution Time   Execution Time   Execution Time   Execution Time 
                                             (sec)            (sec)            (sec)            (sec) 
Solve Ax = b by Gauss-Jordan elimination 
Solve Ax = b by Gaussian elimination 
Find A^{-1} by reducing [A | I] to [I | A^{-1}] 
Solve Ax = b as x = A^{-1}b 
Find det(A) by row reduction 
Solve Ax = b by Cramer's rule 



5. Derive the formula 

\[ 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2} \]

Hint Let $s_n = 1 + 2 + 3 + \cdots + n$. Write the terms of $s_n$ in reverse order and add the two expressions for $s_n$. 



6. Use the result in Exercise 5 to show that $1 + 2 + 3 + \cdots + (n-1) = \frac{n(n-1)}{2}$. 



7. Derive the formula 

\[ 1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6} \]

using the following steps. 

(a) Show that $(k+1)^3 - k^3 = 3k^2 + 3k + 1$. 

(b) Show that 

\[ [2^3 - 1^3] + [3^3 - 2^3] + [4^3 - 3^3] + \cdots + [(n+1)^3 - n^3] = (n+1)^3 - 1 \]

(c) Apply (a) to each term on the left side of (b) to show that 

\[ (n+1)^3 - 1 = 3[1^2 + 2^2 + 3^2 + \cdots + n^2] + 3[1 + 2 + 3 + \cdots + n] + n \]

(d) Solve the equation in (c) for $1^2 + 2^2 + 3^2 + \cdots + n^2$, use the result of Exercise 5, and then simplify. 



8. Use the result in Exercise 7 to show that 

\[ 1^2 + 2^2 + 3^2 + \cdots + (n-1)^2 = \frac{(n-1)n(2n-1)}{6} \]

9. Let R be a row-echelon form of an invertible n × n matrix. Show that solving the linear system Rx = b by back-substitution 
requires 

\[ \frac{n^2}{2} - \frac{n}{2} \text{ multiplications and } \frac{n^2}{2} - \frac{n}{2} \text{ additions} \]



10. Show that to reduce an invertible n × n matrix to $I_n$ by the text method requires 

\[ \frac{n^3}{3} - \frac{n}{3} \text{ multiplications and } \frac{n^3}{3} - \frac{n^2}{2} + \frac{n}{6} \text{ additions} \]

Note Assume that no row interchanges are required. 

11. Consider the variation of Gauss-Jordan elimination in which zeros are introduced above and below a leading 1 as soon as it is 
obtained, and let A be an invertible n × n matrix. Show that to solve a linear system Ax = b using this version of Gauss-Jordan 
elimination requires 

\[ \frac{n^3}{2} + \frac{n^2}{2} \text{ multiplications and } \frac{n^3}{2} - \frac{n}{2} \text{ additions} \]

Note Assume that no row interchanges are required. 

12. (For Readers Who Have Studied Calculus) Show that if $p(x) = a_0 + a_1 x + \cdots + a_k x^k$, where $a_k \neq 0$, then 

\[ \lim_{x \to +\infty} \frac{p(x)}{a_k x^k} = 1 \]

This result justifies the approximation $a_0 + a_1 x + \cdots + a_k x^k \approx a_k x^k$ for large values of x. 



13. 

(a) Why is $y = ((AB^{-1})C)x$ an even less efficient way to find y in Example 1? 

(b) Use the result of Exercise 1 to find the operation count for this approach and for $y = A(B^{-1}(Cx))$. 
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9.9 

LU-DECOMPOSITIONS 



With Gaussian elimination and Gauss-Jordan elimination, a linear system is 
solved by operating systematically on an augmented matrix. In this section 
we shall discuss a different organization of this approach, one based on 
factoring the coefficient matrix into a product of lower and upper triangular 
matrices. This method is well suited for computers and is the basis for many 
practical computer programs. * 



Solving Linear Systems by Factoring 

We shall proceed in two stages. First, we shall show how a linear system Ax = b can be solved very easily once the 
coefficient matrix A is factored into a product of lower and upper triangular matrices. Second, we shall show how to 
construct such factorizations. 

If an n × n matrix A can be factored into a product of n × n matrices as 

\[ A = LU \]

where L is lower triangular and U is upper triangular, then the linear system Ax = b can be solved as follows: 

Step 1. Rewrite the system Ax = b as 

\[ LUx = b \tag{1} \]

Step 2. Define a new n × 1 matrix y by 

\[ Ux = y \tag{2} \]

Step 3. Use (2) to rewrite (1) as Ly = b and solve this system for y. 

Step 4. Substitute y in (2) and solve for x. 

Although this procedure replaces the problem of solving the single system Ax = b by the problem of solving the two systems 
Ly = b and Ux = y, the latter systems are easy to solve because the coefficient matrices are triangular. The following 
example illustrates this procedure. 



EXAMPLE 1 Solving a System by Factorization 



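Later in this section we will derive the factorization 

\[ \begin{bmatrix} 2 & 6 & 2 \\ -3 & -8 & 0 \\ 4 & 9 & 2 \end{bmatrix} = \begin{bmatrix} 2 & 0 & 0 \\ -3 & 1 & 0 \\ 4 & -3 & 7 \end{bmatrix} \begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} \]

Use this result and the method described above to solve the system 

\[ \begin{bmatrix} 2 & 6 & 2 \\ -3 & -8 & 0 \\ 4 & 9 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix} \tag{3} \]

Solution 

Rewrite (3) as 

\[ \begin{bmatrix} 2 & 0 & 0 \\ -3 & 1 & 0 \\ 4 & -3 & 7 \end{bmatrix} \begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix} \tag{4} \]

As specified in Step 2 above, define $y_1$, $y_2$, and $y_3$ by the equation 

\[ \begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} \tag{5} \]

so (4) can be rewritten as 

\[ \begin{bmatrix} 2 & 0 & 0 \\ -3 & 1 & 0 \\ 4 & -3 & 7 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix} \]

or, equivalently, 

\[ \begin{aligned} 2y_1 &= 2 \\ -3y_1 + y_2 &= 2 \\ 4y_1 - 3y_2 + 7y_3 &= 3 \end{aligned} \]

The procedure for solving this system is similar to back-substitution except that the equations are solved from the top down 
instead of from the bottom up. This procedure, which is called forward-substitution, yields 

\[ y_1 = 1, \quad y_2 = 5, \quad y_3 = 2 \]

(verify). Substituting these values in (5) yields the linear system 

\[ \begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 5 \\ 2 \end{bmatrix} \]

or, equivalently, 

\[ \begin{aligned} x_1 + 3x_2 + x_3 &= 1 \\ x_2 + 3x_3 &= 5 \\ x_3 &= 2 \end{aligned} \]

Solving this system by back-substitution yields the solution 

\[ x_1 = 2, \quad x_2 = -1, \quad x_3 = 2 \]

(verify). 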
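The two triangular solves in Example 1 can be checked with a short Python sketch. The function names `forward_substitution` and `back_substitution` are chosen here for illustration only; the factors L and U and the right-hand side b are those of Example 1, and the sketch assumes that all diagonal entries of L and U are nonzero.

```python
from fractions import Fraction


def forward_substitution(L, b):
    """Solve Ly = b for lower triangular L, working from the top down."""
    n = len(L)
    y = [Fraction(0)] * n
    for i in range(n):
        known = sum(L[i][j] * y[j] for j in range(i))
        y[i] = (Fraction(b[i]) - known) / L[i][i]
    return y


def back_substitution(U, y):
    """Solve Ux = y for upper triangular U, working from the bottom up."""
    n = len(U)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(y[i]) - known) / U[i][i]
    return x


# Factors and right-hand side from Example 1.
L = [[2, 0, 0], [-3, 1, 0], [4, -3, 7]]
U = [[1, 3, 1], [0, 1, 3], [0, 0, 1]]
b = [2, 2, 3]

y = forward_substitution(L, b)   # Step 3: solve Ly = b, giving [1, 5, 2]
x = back_substitution(U, y)      # Step 4: solve Ux = y, giving [2, -1, 2]
```

Each solve touches only the entries on or to one side of the diagonal, which is why the triangular systems are so much cheaper than the original one.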

LU-Decompositions 

Now that we have seen how a linear system of n equations in n unknowns can be solved by factoring the coefficient matrix, 
we shall turn to the problem of constructing such factorizations. To motivate the method, suppose that an n × n matrix A has 
been reduced to a row-echelon form U by Gaussian elimination, that is, by a certain sequence of elementary row operations. 
By Theorem 1.5.1 each of these operations can be accomplished by multiplying on the left by an appropriate elementary 
matrix. Thus there are elementary matrices $E_1, E_2, \ldots, E_k$ such that 

\[ E_k \cdots E_2 E_1 A = U \tag{6} \]

By Theorem 1.5.2, $E_1, E_2, \ldots, E_k$ are invertible, so we can multiply both sides of Equation (6) on the left successively by 

\[ E_k^{-1}, \ldots, E_2^{-1}, E_1^{-1} \]

to obtain 

\[ A = E_1^{-1} E_2^{-1} \cdots E_k^{-1} U \tag{7} \]

In Exercise 15 we will help the reader to show that the matrix L defined by 

\[ L = E_1^{-1} E_2^{-1} \cdots E_k^{-1} \tag{8} \]

is lower triangular provided that no row interchanges are used in reducing A to U. Assuming this to be so, substituting (8) into 
(7) yields 

\[ A = LU \]

which is a factorization of A into a product of a lower triangular matrix and an upper triangular matrix. 

The following theorem summarizes the above result. 
THEOREM 9.9.1 



If A is a square matrix that can be reduced to a row-echelon form U by Gaussian elimination without row interchanges, 
then A can be factored as A = LU, where L is a lower triangular matrix. 



DEFINITION 



A factorization of a square matrix A as A = LU, where L is lower triangular and U is upper triangular, is called an 
LU-decomposition or triangular decomposition of the matrix A. 



EXAMPLE 2 An LU-Decomposition 



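Find an LU-decomposition of 

\[ A = \begin{bmatrix} 2 & 6 & 2 \\ -3 & -8 & 0 \\ 4 & 9 & 2 \end{bmatrix} \]

Solution 

To obtain an LU-decomposition, A = LU, we shall reduce A to a row-echelon form U using Gaussian elimination and then 
calculate L from (8). The steps are, in order: multiply the first row by 1/2; add 3 times the first row to the second; add -4 times 
the first row to the third; add 3 times the second row to the third; multiply the third row by 1/7. Thus 

\[ U = \begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} \]

and, from (8), multiplying out the inverses of the corresponding elementary matrices gives 

\[ L = \begin{bmatrix} 2 & 0 & 0 \\ -3 & 1 & 0 \\ 4 & -3 & 7 \end{bmatrix} \]

so 

\[ \begin{bmatrix} 2 & 6 & 2 \\ -3 & -8 & 0 \\ 4 & 9 & 2 \end{bmatrix} = \begin{bmatrix} 2 & 0 & 0 \\ -3 & 1 & 0 \\ 4 & -3 & 7 \end{bmatrix} \begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} \]

is an LU-decomposition of A. 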
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Procedure for Finding LU-Decompositions 

As this example shows, most of the work in constructing an LU-decomposition is expended in the calculation of L. However, 
all this work can be eliminated by some careful bookkeeping of the operations used to reduce A to U. Because we are 
assuming that no row interchanges are required to reduce A to U, there are only two types of operations involved: multiplying 
a row by a nonzero constant, and adding a multiple of one row to another. The first operation is used to introduce the leading 
1's and the second to introduce zeros below the leading 1's. 

In Example 2, the multipliers needed to introduce the leading 1's in successive rows were as follows: 

1/2 for the first row 

1 for the second row 

1/7 for the third row 

Note that in (9), the successive diagonal entries in L (boxed) were precisely the reciprocals of these multipliers: 

\[ L = \begin{bmatrix} \boxed{2} & 0 & 0 \\ -3 & \boxed{1} & 0 \\ 4 & -3 & \boxed{7} \end{bmatrix} \tag{9} \]



Next, observe that to introduce zeros below the leading 1 in the first row, we used the operations 

add 3 times the first row to the second 

add -4 times the first row to the third 

and to introduce the zero below the leading 1 in the second row, we used the operation 

add 3 times the second row to the third 

Now note in (10) that in each position below the main diagonal of L (boxed), the entry is the negative of the multiplier in the operation 
that introduced the zero in that position in U: 

\[ L = \begin{bmatrix} 2 & 0 & 0 \\ \boxed{-3} & 1 & 0 \\ \boxed{4} & \boxed{-3} & 7 \end{bmatrix} \tag{10} \]

We state without proof that the same happens in the general case. Therefore, we have the following procedure for 
constructing an LU-decomposition of a square matrix A provided that A can be reduced to row-echelon form without row 
interchanges. 



Step 1. Reduce A to a row-echelon form U by Gaussian elimination without row interchanges, keeping track of the 
multipliers used to introduce the leading 1's and the multipliers used to introduce the zeros below the leading 1's. 

Step 2. In each position along the main diagonal of L, place the reciprocal of the multiplier that introduced the leading 1 
in that position in U. 

Step 3. In each position below the main diagonal of L, place the negative of the multiplier used to introduce the zero in 
that position in U. 
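The three-step bookkeeping procedure above might be sketched in Python as follows. The function name `lu_decomposition` is chosen here for illustration, and the sketch makes a stronger assumption than the text: every pivot encountered is nonzero (so A is invertible and no row interchange is needed); otherwise it raises an error rather than continuing.

```python
from fractions import Fraction


def lu_decomposition(A):
    """Return (L, U) with A = LU, where U has 1's on its main diagonal.

    Follows the bookkeeping procedure: diagonal entries of L are the
    reciprocals of the multipliers that create the leading 1's, and
    entries below the diagonal are the negatives of the multipliers
    that create the zeros.
    """
    n = len(A)
    U = [[Fraction(e) for e in row] for row in A]
    L = [[Fraction(0)] * n for _ in range(n)]
    for k in range(n):
        pivot = U[k][k]
        if pivot == 0:
            raise ValueError("zero pivot: a row interchange would be required")
        # The multiplier 1/pivot introduces the leading 1 (Step 2).
        L[k][k] = pivot
        U[k] = [entry / pivot for entry in U[k]]
        for r in range(k + 1, n):
            # Adding `multiplier` times row k to row r introduces the zero (Step 3).
            multiplier = -U[r][k]
            L[r][k] = -multiplier
            U[r] = [u_r + multiplier * u_k for u_r, u_k in zip(U[r], U[k])]
    return L, U


# The matrix of Example 2.
A = [[2, 6, 2], [-3, -8, 0], [4, 9, 2]]
L, U = lu_decomposition(A)
# L equals [[2, 0, 0], [-3, 1, 0], [4, -3, 7]]
# U equals [[1, 3, 1], [0, 1, 3], [0, 0, 1]]
```

Exact fractions are used so that the entries of L and U can be compared directly with the hand computation.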



EXAMPLE 3 Finding an LU-Decomposition 

Find an LU-decomposition of 

\[ A = \begin{bmatrix} 6 & -2 & 0 \\ 9 & -1 & 1 \\ 3 & 7 & 5 \end{bmatrix} \]

Solution 

We begin by reducing A to row-echelon form, keeping track of all multipliers (for example, the multiplier used to introduce the 
zero in the second row, first column is -9). Constructing L from the multipliers yields the LU-decomposition 

\[ A = LU = \begin{bmatrix} 6 & 0 & 0 \\ 9 & 2 & 0 \\ 3 & 8 & 1 \end{bmatrix} \begin{bmatrix} 1 & -\frac{1}{3} & 0 \\ 0 & 1 & \frac{1}{2} \\ 0 & 0 & 1 \end{bmatrix} \]



We conclude this section by briefly discussing two fundamental questions about LU-decompositions: 

1. Does every square matrix have an LU-decomposition? 

2. Can a square matrix have more than one LU-decomposition? 

We already know that if a square matrix A can be reduced to row-echelon form by Gaussian elimination without row 
interchanges, then A has an LU-decomposition. In general, if row interchanges are required to reduce matrix A to 
row-echelon form, then there is no LU-decomposition of A. However, in such cases it is possible to factor A in the form of a 
PLU-decomposition 

\[ A = PLU \]

where L is lower triangular, U is upper triangular, and P is a matrix obtained by interchanging the rows of $I_n$ appropriately 
(see Exercise 17). Any matrix that is equal to the identity matrix with the order of its rows changed is called a permutation 
matrix. 

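In the absence of additional restrictions, LU-decompositions are not unique. For example, if 

\[ A = LU = \begin{bmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{bmatrix} \begin{bmatrix} 1 & u_{12} & u_{13} \\ 0 & 1 & u_{23} \\ 0 & 0 & 1 \end{bmatrix} \]

and L has nonzero diagonal entries, then we can shift the diagonal entries from the left factor to the right factor by writing 

\[ A = \begin{bmatrix} 1 & 0 & 0 \\ l_{21}/l_{11} & 1 & 0 \\ l_{31}/l_{11} & l_{32}/l_{22} & 1 \end{bmatrix} \begin{bmatrix} l_{11} & 0 & 0 \\ 0 & l_{22} & 0 \\ 0 & 0 & l_{33} \end{bmatrix} \begin{bmatrix} 1 & u_{12} & u_{13} \\ 0 & 1 & u_{23} \\ 0 & 0 & 1 \end{bmatrix} \]

\[ \phantom{A} = \begin{bmatrix} 1 & 0 & 0 \\ l_{21}/l_{11} & 1 & 0 \\ l_{31}/l_{11} & l_{32}/l_{22} & 1 \end{bmatrix} \begin{bmatrix} l_{11} & l_{11}u_{12} & l_{11}u_{13} \\ 0 & l_{22} & l_{22}u_{23} \\ 0 & 0 & l_{33} \end{bmatrix} \]

which is another triangular decomposition of A. 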
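The shift of diagonal entries from the left factor to the right factor can be tried numerically. In the following Python sketch the function names `mat_mul` and `shift_diagonal` are hypothetical, and the factors used are those obtained for the matrix of Example 2; the sketch assumes, as in the text, that L has nonzero diagonal entries.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Return the matrix product XY for matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def shift_diagonal(L, U):
    """Move the diagonal of L into U.

    Returns (L1, U1) with L1 U1 = L U, where L1 has 1's on its diagonal.
    Column j of L is divided by l_jj and row i of U is multiplied by l_ii.
    """
    n = len(L)
    d = [Fraction(L[j][j]) for j in range(n)]
    L1 = [[Fraction(L[i][j]) / d[j] for j in range(n)] for i in range(n)]
    U1 = [[d[i] * U[i][j] for j in range(n)] for i in range(n)]
    return L1, U1


# Factors of the matrix A from Example 2.
L = [[2, 0, 0], [-3, 1, 0], [4, -3, 7]]
U = [[1, 3, 1], [0, 1, 3], [0, 0, 1]]
L1, U1 = shift_diagonal(L, U)

# Both pairs of factors multiply to the same matrix A.
same_product = mat_mul(L, U) == mat_mul(L1, U1)
```

Here `L1` has 1's on its diagonal while `U1` carries the entries 2, 1, 7 on its diagonal, so the same matrix has two different triangular decompositions.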



Exercise Set 9.9 






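1. Use the method of Example 1 and the LU-decomposition 

\[ \begin{bmatrix} 3 & -6 \\ -2 & 5 \end{bmatrix} = \begin{bmatrix} 3 & 0 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix} \]

to solve the system 

\[ \begin{aligned} 3x_1 - 6x_2 &= 0 \\ -2x_1 + 5x_2 &= 1 \end{aligned} \]

2. Use the method of Example 1 and the LU-decomposition 

\[ \begin{bmatrix} 3 & -6 & -3 \\ 2 & 0 & 6 \\ -4 & 7 & 4 \end{bmatrix} = \begin{bmatrix} 3 & 0 & 0 \\ 2 & 4 & 0 \\ -4 & -1 & 2 \end{bmatrix} \begin{bmatrix} 1 & -2 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{bmatrix} \]

to solve the system 

\[ \begin{aligned} 3x_1 - 6x_2 - 3x_3 &= -3 \\ 2x_1 + 6x_3 &= -22 \\ -4x_1 + 7x_2 + 4x_3 &= 3 \end{aligned} \]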



In Exercises 3-10 find an LU-decomposition of the coefficient matrix; then use the method of Example 1 to solve the 
system. 

11. Let 

\[ A = \begin{bmatrix} 2 & 1 & -1 \\ -2 & -1 & 2 \\ 2 & 1 & 0 \end{bmatrix} \]

(a) Find an LU-decomposition of A. 

(b) Express A in the form $A = L_1 D U_1$, where $L_1$ is lower triangular with 1's along the main diagonal, $U_1$ is upper 
triangular, and D is a diagonal matrix. 

(c) Express A in the form $A = L_2 U_2$, where $L_2$ is lower triangular with 1's along the main diagonal and $U_2$ is upper 
triangular. 



12. 

(a) Show that the matrix 

\[ \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \]

has no LU-decomposition. 

(b) Find a PLU-decomposition of this matrix. 



13. Let 

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

(a) Prove: If $a \neq 0$, then A has a unique LU-decomposition with 1's along the main diagonal of L. 

(b) Find the LU-decomposition described in part (a). 



14. Let Ax = b be a linear system of n equations in n unknowns, and assume that A is an invertible matrix that can be 
reduced to row-echelon form without row interchanges. How many additions and multiplications are required to solve 
the system by the method of Example 1? 

Note Count subtractions as additions and divisions as multiplications. 

15. Recall from Theorem 1.7.1b that a product of lower triangular matrices is lower triangular. Use this fact to prove that the 
matrix L in (8) is lower triangular. 

16. Use the result in Exercise 15 to prove that a product of finitely many upper triangular matrices is upper triangular. 

17. Prove: If A is any n × n matrix, then A can be factored as A = PLU, where L is lower triangular, U is upper triangular, 
and P can be obtained by interchanging the rows of $I_n$ appropriately. 

Hint Let U be a row-echelon form of A, and let all row interchanges required in the reduction of A to U be performed 
first. 



18. Factor 

\[ A = \begin{bmatrix} 3 & -1 & 0 \\ 3 & -1 & 1 \\ 0 & 2 & 1 \end{bmatrix} \]

as A = PLU, where P is a permutation matrix obtained from $I_3$ by interchanging rows appropriately, L is lower 
triangular, and U is upper triangular. 

19. Show that if A = PLU, then Ax = b may be solved by a two-step process similar to the process in Example 1. Use this 
method to solve Ax = b, where A is the matrix in Exercise 18 and $b = e_2$. 
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Chapter 9 



Technology Exercises 



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, 
Maple, Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear 
algebra capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The 
goal of these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the 
techniques in these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise 
sets. 



Section 9.1 



T1. 

(a) Find a general solution of the system 

\[ \begin{aligned} y_1' &= 3y_1 + 2y_2 + 2y_3 \\ y_2' &= y_1 + 4y_2 + y_3 \\ y_3' &= -2y_1 - 4y_2 - y_3 \end{aligned} \]

by computing appropriate eigenvalues and eigenvectors. 

(b) Find the solution that satisfies the initial conditions $y_1(0) = 0$, $y_2(0) = 1$, $y_3(0) = -3$. 

T2. The electrical circuit in the accompanying figure, called a parallel LRC circuit, contains a resistor with resistance R ohms 
(Ω), an inductor with inductance L henries (H), and a capacitor with capacitance C farads (F). It is shown in electrical circuit 
theory that the current I in amperes (A) through the inductor and the voltage drop V in volts (V) across the capacitor satisfy 
the system of differential equations 

\[ \frac{dI}{dt} = \frac{V}{L}, \qquad \frac{dV}{dt} = -\frac{I}{C} - \frac{V}{RC} \]

where the derivatives are with respect to the time t. Find I and V as functions of t if L = 0.5 H, C = 0.2 F, R = 2 Ω, and the 
initial values of V and I are V(0) = 1 V and I(0) = 2 A. 

[Figure Ex-T2: parallel LRC circuit with capacitor C, resistor R, and inductor L] 



Section 9.3 



T1. (Least Squares Straight Line Fit) 



Read your documentation on finding the least squares straight line fit to a set of data points, and then use your utility to find 
the line of best fit to the data in Example 1 . Do not imitate the method in the example; rather, use the command provided by 
your utility. 

T2. (Least Squares Polynomial Fit) 

Read your documentation on fitting polynomials to a set of data points by least squares, and then use your utility to find the 
polynomial fit to the data in Example 3. Do not imitate the method in the example; rather, use the command provided by your 
utility. 

Section 9.7 

T1. (Quadric Surfaces) 

Use your technology utility to perform the computations in Example 3. 

Section 9.9 

T1. (LU-decomposition) 

Technology utilities vary widely in how they handle LU- and PLU-decompositions. For example, most programs perform 
row interchanges to reduce roundoff error and hence produce a PLU-decomposition, even when asked for an 
LU-decomposition. Read your documentation, and then see what happens when you use your utility to find an 
LU-decomposition of the matrix A in Example 3. 
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10 



CHAPTER 



Complex Vector Spaces 



INTRODUCTION: Up to now we have considered only vector spaces for which the scalars are real numbers. However, 

for many important applications of vectors, it is desirable to allow the scalars to be complex numbers. For example, in 
problems involving systems of differential equations, complex eigenvalues are often the case of greatest interest. 

In the first three sections of this chapter we will review some of the basic properties of complex numbers, and in subsequent 
sections we will discuss vector spaces in which scalars can be complex numbers. 
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10.1 

COMPLEX NUMBERS 



In this section we shall review the definition of a complex number and discuss 
the addition, subtraction, and multiplication of such numbers. We will also 
consider matrices with complex entries and explain how addition and 
subtraction of complex numbers can be viewed as operations on vectors. 



Complex Numbers 

Since $x^2 \geq 0$ for every real number x, the equation $x^2 = -1$ has no real solutions. To deal with this problem, mathematicians of 
the eighteenth century introduced the "imaginary" number, 

\[ i = \sqrt{-1} \]

which they assumed had the property 

\[ i^2 = (\sqrt{-1})^2 = -1 \]

but which otherwise could be treated like an ordinary number. Expressions of the form 

\[ a + bi \tag{1} \]

where a and b are real numbers, were called "complex numbers," and these were manipulated according to the standard rules of 
arithmetic with the added property that $i^2 = -1$. 

By the beginning of the nineteenth century it was recognized that a complex number (1) could be regarded as an alternative symbol 
for the ordered pair 

\[ (a, b) \]

of real numbers, and that operations of addition, subtraction, multiplication, and division could be defined on these ordered pairs 
so that the familiar laws of arithmetic hold and $i^2 = -1$. This is the approach we will follow. 



DEFINITION 



A complex number is an ordered pair of real numbers, denoted either by (a, b) or by a + bi, where $i^2 = -1$. 



EXAMPLE 1 Two Notations for a Complex Number 



Some examples of complex numbers in both notations are as follows: 

Ordered Pair    Equivalent Notation 
(3, 4)          3 + 4i 
(-1, 2)         -1 + 2i 
(0, 1)          0 + i 
(2, 0)          2 + 0i 
(4, -2)         4 + (-2)i 

For simplicity, the last three complex numbers would usually be abbreviated as 

0 + i = i,   2 + 0i = 2,   4 + (-2)i = 4 - 2i 

Geometrically, a complex number can be viewed as either a point or a vector in the xy-plane (Figure 10.1.1). 

[Figure 10.1.1: (a) Complex number as a point. (b) Complex number as a vector.] 



EXAMPLE 2 Complex Numbers as Points and as Vectors 



Some complex numbers are shown as points in Figure 10.1.2a and as vectors in Figure 10.1.2b. 
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Figure 10.1.2 



The Complex Plane 

Sometimes it is convenient to use a single letter, such as z, to denote a complex number. Thus we might write 

z = a + bi 

The real number a is called the real part of z, and the real number b is called the imaginary part of z. These numbers are denoted 
by Re(z) and Im(z), respectively. Thus 

Re(4 - 3i) = 4 and Im(4 - 3i) = -3 

When complex numbers are represented geometrically in an xy-coordinate system, the x-axis is called the real axis, the y-axis is 
called the imaginary axis, and the plane is called the complex plane (Figure 10.1.3). The resulting plot is called an Argand 
diagram. 
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Figure 10.1.3 
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Argand diagram. 



Operations on Complex Numbers 

Just as two vectors in $R^2$ are defined to be equal if they have the same components, so we define two complex numbers to be 
equal if their real parts are equal and their imaginary parts are equal: 

DEFINITION 

Two complex numbers, a + bi and c + di, are defined to be equal, written 

a + bi = c + di 

if a = c and b = d. 

If b = 0, then the complex number a + bi reduces to a + 0i, which we write simply as a. Thus, for any real number a, 

a = a + 0i 

so the real numbers can be regarded as complex numbers with an imaginary part of zero. Geometrically, the real numbers 
correspond to points on the real axis. If we have a = 0, then a + bi reduces to 0 + bi, which we usually write as bi. These complex 
numbers, which correspond to points on the imaginary axis, are called pure imaginary numbers. 

Just as vectors in $R^2$ are added by adding corresponding components, so complex numbers are added by adding their real parts 
and adding their imaginary parts: 

\[ (a + bi) + (c + di) = (a + c) + (b + d)i \tag{2} \]

The operations of subtraction and multiplication by a real number are also similar to the corresponding vector operations in $R^2$: 

\[ (a + bi) - (c + di) = (a - c) + (b - d)i \tag{3} \]

\[ k(a + bi) = (ka) + (kb)i, \quad k \text{ real} \tag{4} \]

Because the operations of addition, subtraction, and multiplication of a complex number by a real number parallel the 
corresponding operations for vectors in $R^2$, the familiar geometric interpretations of these operations hold for complex numbers 
(see Figure 10.1.4). 




[Figure 10.1.4: (a) The sum of two complex numbers. (b) The difference of two complex numbers. (c) The product of a complex 
number z and a positive real number k. (d) The product of a complex number z and a negative real number k.] 



It follows from (4) that (-1)z + z = 0 (verify), so we denote (-1)z as -z and call it the negative of z. 



EXAMPLE 3 Adding, Subtracting, and Multiplying by Real Numbers 



If $z_1 = 4 - 5i$ and $z_2 = -1 + 6i$, find $z_1 + z_2$, $z_1 - z_2$, $3z_1$, and $-z_2$. 

Solution 

\[ \begin{aligned} z_1 + z_2 &= (4 - 5i) + (-1 + 6i) = (4 - 1) + (-5 + 6)i = 3 + i \\ z_1 - z_2 &= (4 - 5i) - (-1 + 6i) = (4 + 1) + (-5 - 6)i = 5 - 11i \\ 3z_1 &= 3(4 - 5i) = 12 - 15i \\ -z_2 &= (-1)z_2 = (-1)(-1 + 6i) = 1 - 6i \end{aligned} \]

So far, there has been a parallel between complex numbers and vectors in $R^2$. However, we now define multiplication of complex 
numbers, an operation with no vector analog in $R^2$. To motivate the definition, we expand the product 

\[ (a + bi)(c + di) \]

following the usual rules of algebra but treating $i^2$ as -1. This yields 

\[ \begin{aligned} (a + bi)(c + di) &= ac + bdi^2 + adi + bci \\ &= (ac - bd) + (ad + bc)i \end{aligned} \]

which suggests the following definition: 

\[ (a + bi)(c + di) = (ac - bd) + (ad + bc)i \tag{5} \]



EXAMPLE 4 Multiplying Complex Numbers 



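\[ \begin{aligned} (3 + 2i)(4 + 5i) &= (3 \cdot 4 - 2 \cdot 5) + (3 \cdot 5 + 2 \cdot 4)i = 2 + 23i \\ (4 - i)(2 - 3i) &= [4 \cdot 2 - (-1)(-3)] + [(4)(-3) + (-1)(2)]i = 5 - 14i \\ i^2 = (0 + i)(0 + i) &= (0 \cdot 0 - 1 \cdot 1) + (0 \cdot 1 + 1 \cdot 0)i = -1 \end{aligned} \]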
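Formula (5) can be tried out directly on ordered pairs. In the following Python sketch a complex number a + bi is stored as the pair (a, b); the function name `multiply` is chosen for illustration, and the comparison with Python's built-in `complex` type is included only as an independent check.

```python
def multiply(z, w):
    """Multiply the ordered pairs z = (a, b) and w = (c, d) by Formula (5)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


# The three products of Example 4.
p1 = multiply((3, 2), (4, 5))      # (2, 23),  that is, 2 + 23i
p2 = multiply((4, -1), (2, -3))    # (5, -14), that is, 5 - 14i
p3 = multiply((0, 1), (0, 1))      # (-1, 0),  that is, i^2 = -1

# Independent check against Python's built-in complex arithmetic.
builtin = complex(3, 2) * complex(4, 5)
agrees = (builtin.real, builtin.imag) == p1
```

Treating a complex number as an ordered pair in this way mirrors the definition adopted in this section.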



We leave it as an exercise to verify the following rules of complex arithmetic: 

\[ \begin{aligned} z_1 + z_2 &= z_2 + z_1 \\ z_1 z_2 &= z_2 z_1 \\ z_1 + (z_2 + z_3) &= (z_1 + z_2) + z_3 \\ z_1(z_2 z_3) &= (z_1 z_2)z_3 \\ z_1(z_2 + z_3) &= z_1 z_2 + z_1 z_3 \\ 0 + z &= z \\ z + (-z) &= 0 \\ 1 \cdot z &= z \end{aligned} \]

These rules make it possible to multiply complex numbers without using Formula (5) directly. Following the procedure used to 
motivate this formula, we can simply multiply each term of a + bi by each term of c + di, set $i^2 = -1$, and simplify. 



EXAMPLE 5 Multiplication of Complex Numbers 



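\[ \begin{aligned} (3 + 2i)(4 + i) &= 12 + 3i + 8i + 2i^2 = 12 + 11i - 2 = 10 + 11i \\ \left(5 - \tfrac{1}{2}i\right)(2 + 3i) &= 10 + 15i - i - \tfrac{3}{2}i^2 = 10 + 14i + \tfrac{3}{2} = \tfrac{23}{2} + 14i \\ i(1 + i)(1 - 2i) &= i(1 - 2i + i - 2i^2) = i(3 - i) = 3i - i^2 = 1 + 3i \end{aligned} \]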



Remark Unlike the real numbers, there is no size ordering for the complex numbers. Thus, the order symbols <, ≤, >, and ≥ are 
not used with complex numbers. 

Now that we have defined addition, subtraction, and multiplication of complex numbers, it is possible to add, subtract, and 
multiply matrices with complex entries and to multiply a matrix by a complex number. Without going into detail, we note that the 
matrix operations and terminology discussed in Chapter 1 carry over without change to matrices with complex entries. 



EXAMPLE 6 Matrices with Complex Entries 



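If 

\[ A = \begin{bmatrix} 1 & -i \\ 1 + i & 4 - i \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} i & 1 - i \\ 2 - 3i & 4 \end{bmatrix} \]

then 

\[ \begin{aligned} AB &= \begin{bmatrix} 1 & -i \\ 1 + i & 4 - i \end{bmatrix} \begin{bmatrix} i & 1 - i \\ 2 - 3i & 4 \end{bmatrix} \\ &= \begin{bmatrix} 1 \cdot i + (-i)(2 - 3i) & 1 \cdot (1 - i) + (-i) \cdot 4 \\ (1 + i) \cdot i + (4 - i)(2 - 3i) & (1 + i)(1 - i) + (4 - i) \cdot 4 \end{bmatrix} \\ &= \begin{bmatrix} -3 - i & 1 - 5i \\ 4 - 13i & 18 - 4i \end{bmatrix} \end{aligned} \]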


Exercise Set 10.1 






1. In each part, plot the point and sketch the vector that corresponds to the given complex number. 

(a) 2 + 3i 

(b) -4 

(c) -3 - 2i 

(d) -5i 

2. Express each complex number in Exercise 1 as an ordered pair of real numbers. 

3. In each part, use the given information to find the real numbers x and y. 

(a) x - iy = -2 + 3i 

(b) (x + y) + (x - y)i = 3 + i 

4. Given that $z_1 = 1 - 2i$ and $z_2 = 4 + 5i$, find 

(a) $z_1 + z_2$ 

(b) $z_1 - z_2$ 

(c) $4z_1$ 

(d) $-z_2$ 

(e) $3z_1 + 4z_2$ 


5. In each part, solve for z. 

(a) z + (1 - i) = 3 + 2i 

(b) -5z = 5 + 10i 

(c) (i - z) + (2z - 3i) = -2 + 7i 

6. In each part, sketch the vectors $z_1$, $z_2$, $z_1 + z_2$, and $z_1 - z_2$. 

(a) $z_1 = 3 + i$, $z_2 = 1 + 4i$ 

(b) $z_1 = -2 + 2i$, $z_2 = 4 + 5i$ 

7. In each part, sketch the vectors z and kz. 

(a) z = 1 + i, k = 2 

(b) z = -3 - 4i, k = -2 

(c) z = 4 + 6i, k = 1/2 

8. In each part, find real numbers $k_1$ and $k_2$ that satisfy the equation. 

(a) $k_1 i + k_2(1 + i) = 3 - 2i$ 

(b) $k_1(2 + 3i) + k_2(1 - 4i) = 7 + 5i$ 



9. In each part, find $z_1 z_2$, $z_1^2$, and $z_2^2$. 

(a) $z_1 = 3i$, $z_2 = 1 - i$ 

(b) $z_1 = 4 + 6i$, $z_2 = 2 - 3i$ 

(c) $z_1 = \frac{1}{3}(2 + 4i)$, $z_2 = \frac{1}{2}(1 - 5i)$ 

10. Given that $z_1 = 2 - 5i$ and $z_2 = -1 - i$, find 

(a) $z_1 - z_1 z_2$ 

(b) $(z_1 + 3z_2)^2$ 

(c) $[z_1 + (1 + z_2)]^2$ 



(d) -2- 2 



In Exercises 11-18 perform the calculations and express the result in the form a + bi. 

11. $(1 + 2i)(4 - 6i)^2$ 

12. $(2 - i)(3 + i)(4 - 2i)$ 

13. $(1 - 3i)^3$ 

14. $i(1 + 7i) - 3i(4 + 2i)$ 

15. $\left[(2 + i)\left(\frac{1}{2} + \frac{3}{4}i\right)\right]^2$ 

16. $(\sqrt{2} + i) - i\sqrt{2}(1 + \sqrt{2}\,i)$ 

17. $(1 + i + i^2 + i^3)^{100}$ 

18. $(3 - 2i)^2 - (3 + 2i)^2$ 



19. Let 

\[ A = \begin{bmatrix} 1 & i \\ -i & 3 \end{bmatrix}, \qquad B = \begin{bmatrix} 2 & 2 + i \\ 3 - i & 4 \end{bmatrix} \]

Find 

(a) A + 3iB 

(b) BA 

(c) AB 

(d) $B^2 - A^2$ 



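20. Let 

\[ A = \begin{bmatrix} 3 + 2i & 0 \\ -i & 2 \\ 1 + i & 1 - i \end{bmatrix}, \qquad B = \begin{bmatrix} -i & 2 \\ 0 & i \end{bmatrix}, \qquad C = \begin{bmatrix} -1 - i & 0 & -i \\ 3 & 2i & -5 \end{bmatrix} \]

Find 

(a) A(BC) 

(b) (BC)A 

(c) $(CA)B^2$ 

(d) (1 + i)(AB) + (3 - 4i)A 

21. Show that 

(a) Im(iz) = Re(z) 

(b) Re(iz) = -Im(z) 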

22. In each part, solve the equation by the quadratic formula and check your results by substituting the solutions into the given 
equation. 

(a) $z^2 + 2z + 2 = 0$ 

(b) $z^2 - z + 1 = 0$ 

23. 

(a) Show that if n is a positive integer, then the only possible values for $i^n$ are 1, -1, i, and -i. 

(b) Find $i^{2509}$. 

24. Prove: If $z_1 z_2 = 0$, then $z_1 = 0$ or $z_2 = 0$. 

25. Use the result of Exercise 24 to prove: If $z z_1 = z z_2$ and $z \neq 0$, then $z_1 = z_2$. 

26. Prove that for all complex numbers $z_1$, $z_2$, and $z_3$, 

(a) $z_1 + z_2 = z_2 + z_1$ 

(b) $z_1 + (z_2 + z_3) = (z_1 + z_2) + z_3$ 

27. Prove that for all complex numbers $z_1$, $z_2$, and $z_3$, 

(a) $z_1 z_2 = z_2 z_1$ 

(b) $z_1(z_2 z_3) = (z_1 z_2)z_3$ 

28. Prove that $z_1(z_2 + z_3) = z_1 z_2 + z_1 z_3$ for all complex numbers $z_1$, $z_2$, and $z_3$. 



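29. In quantum mechanics the Dirac matrices are 

\[ \beta = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}, \qquad \alpha_x = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix} \]

\[ \alpha_y = \begin{bmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \end{bmatrix}, \qquad \alpha_z = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{bmatrix} \]

(a) Prove that $\beta^2 = \alpha_x^2 = \alpha_y^2 = \alpha_z^2 = I$. 

(b) Two matrices A and B are called anticommutative if AB = -BA. Prove that any two distinct Dirac matrices are 
anticommutative. 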



30. Describe the set of all complex numbers z = a + bi such that $a^2 + b^2 = 1$. Show that if $z_1$, $z_2$ are such numbers, then so is 
$z_1 z_2$. 
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10.2 In the last section we defined multiplication of complex numbers. In this 

section we shall define division of complex numbers as the inverse of multiplication. 

DIVISION OF COMPLEX NUMBERS 



We begin with some preliminary ideas. 



Complex Conjugates 

If $z = a + bi$ is any complex number, then the complex conjugate of $z$ (also called the conjugate of $z$) is denoted by the symbol $\bar{z}$ (read "z bar" or "z conjugate") and is defined by

$$\bar{z} = a - bi$$

In words, $\bar{z}$ is obtained by reversing the sign of the imaginary part of $z$. Geometrically, $\bar{z}$ is the reflection of $z$ about the real axis (Figure 10.2.1).




Figure 10.2.1 The conjugate of a complex number.



EXAMPLE 1 Examples of Conjugates 



$z = 3 + 2i \qquad \bar{z} = 3 - 2i$

$z = -4 - 2i \qquad \bar{z} = -4 + 2i$

$z = i \qquad \bar{z} = -i$

$z = 4 \qquad \bar{z} = 4$



Remark The last line in Example 1 illustrates the fact that a real number is the same as its conjugate. More precisely, it can be shown (Exercise 22) that $z = \bar{z}$ if and only if $z$ is a real number.



If a complex number $z$ is viewed as a vector in $R^2$, then the norm or length of the vector is called the modulus of $z$. More precisely:



DEFINITION

The modulus of a complex number $z = a + bi$, denoted by $|z|$, is defined by

$$|z| = \sqrt{a^2 + b^2} \tag{1}$$

If $b = 0$, then $z = a$ is a real number, and

$$|z| = \sqrt{a^2 + 0^2} = \sqrt{a^2} = |a|$$

so the modulus of a real number is simply its absolute value. Thus the modulus of $z$ is also called the absolute value of $z$.




Paul Adrien Maurice Dirac (1902-1984) was a British theoretical physicist who devised a new form of quantum 
mechanics and a theory that predicted electron spin and the existence of a fundamental atomic particle called a positron. 
He received the Nobel Prize for physics in 1933 and the medal of the Royal Society in 1939. 



EXAMPLE 2 Modulus of a Complex Number 



Find $|z|$ if $z = 3 - 4i$.

Solution

From (1) with $a = 3$ and $b = -4$, $|z| = \sqrt{(3)^2 + (-4)^2} = \sqrt{25} = 5$.

The following theorem establishes a basic relationship between $\bar{z}$ and $|z|$.



THEOREM 10.2.1

For any complex number $z$,

$$z\bar{z} = |z|^2$$

Proof If $z = a + bi$, then

$$z\bar{z} = (a + bi)(a - bi) = a^2 - abi + bai - b^2 i^2 = a^2 + b^2 = |z|^2$$
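
Theorem 10.2.1 can be spot-checked numerically. The following is a minimal sketch using Python's built-in complex type; the helper name `modulus` is a hypothetical choice for illustration and is not part of the text.

```python
import math

def modulus(z: complex) -> float:
    """Modulus |z| = sqrt(a^2 + b^2), following Definition (1)."""
    return math.sqrt(z.real ** 2 + z.imag ** 2)

z = 3 - 4j                      # the number used in Example 2
product = z * z.conjugate()     # z times its conjugate

print(modulus(z))               # 5.0
print(product)                  # (25+0j), which equals |z|^2
```

The product has zero imaginary part, as the proof predicts, because the cross terms $-abi$ and $bai$ cancel.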



Division of Complex Numbers 

We now turn to the division of complex numbers. Our objective is to define division as the inverse of multiplication. Thus, if $z_2 \neq 0$, then our definition of $z = z_1 / z_2$ should be such that

$$z_1 = z_2 z \tag{2}$$

Our procedure will be to prove that (2) has a unique solution for $z$ if $z_2 \neq 0$, and then to define $z_1 / z_2$ to be this value of $z$. As with real numbers, division by zero is not allowed.



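THEOREM 10.2.2

If $z_2 \neq 0$, then Equation (2) has a unique solution, which is

$$z = \frac{1}{|z_2|^2}\, z_1 \bar{z}_2 \tag{3}$$
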
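Proof Let $z = x + iy$, $z_1 = x_1 + iy_1$, and $z_2 = x_2 + iy_2$. Then (2) can be written as

$$x_1 + iy_1 = (x_2 + iy_2)(x + iy)$$

or

$$x_1 + iy_1 = (x_2 x - y_2 y) + i(y_2 x + x_2 y)$$

or, on equating real and imaginary parts,

$$x_2 x - y_2 y = x_1$$
$$y_2 x + x_2 y = y_1$$

or

$$\begin{bmatrix} x_2 & -y_2 \\ y_2 & x_2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x_1 \\ y_1 \end{bmatrix} \tag{4}$$

Since $z_2 = x_2 + iy_2 \neq 0$, it follows that $x_2$ and $y_2$ are not both zero, so

$$\begin{vmatrix} x_2 & -y_2 \\ y_2 & x_2 \end{vmatrix} = x_2^2 + y_2^2 \neq 0$$

Thus, by Cramer's rule (Theorem 2.1.4), system (4) has the unique solution

$$x = \frac{\begin{vmatrix} x_1 & -y_2 \\ y_1 & x_2 \end{vmatrix}}{\begin{vmatrix} x_2 & -y_2 \\ y_2 & x_2 \end{vmatrix}} = \frac{x_1 x_2 + y_1 y_2}{x_2^2 + y_2^2} = \frac{x_1 x_2 + y_1 y_2}{|z_2|^2}$$

$$y = \frac{\begin{vmatrix} x_2 & x_1 \\ y_2 & y_1 \end{vmatrix}}{\begin{vmatrix} x_2 & -y_2 \\ y_2 & x_2 \end{vmatrix}} = \frac{y_1 x_2 - x_1 y_2}{x_2^2 + y_2^2} = \frac{y_1 x_2 - x_1 y_2}{|z_2|^2}$$

Therefore,

$$z = x + iy = \frac{1}{|z_2|^2}\left[(x_1 x_2 + y_1 y_2) + i(y_1 x_2 - x_1 y_2)\right] = \frac{1}{|z_2|^2}(x_1 + iy_1)(x_2 - iy_2) = \frac{1}{|z_2|^2}\, z_1 \bar{z}_2$$
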

Thus, for $z_2 \neq 0$, we define

$$\frac{z_1}{z_2} = \frac{1}{|z_2|^2}\, z_1 \bar{z}_2 \tag{5}$$



Remark To remember this formula, multiply the numerator and denominator of $z_1 / z_2$ by $\bar{z}_2$:

$$\frac{z_1}{z_2} = \frac{z_1 \bar{z}_2}{z_2 \bar{z}_2} = \frac{z_1 \bar{z}_2}{|z_2|^2} = \frac{1}{|z_2|^2}\, z_1 \bar{z}_2$$

EXAMPLE 3 Quotient in the Form a + bi

Express

$$\frac{3 + 4i}{1 - 2i}$$

in the form $a + bi$.

Solution

From (5) with $z_1 = 3 + 4i$ and $z_2 = 1 - 2i$,

$$\frac{3 + 4i}{1 - 2i} = \frac{1}{|1 - 2i|^2}(3 + 4i)\overline{(1 - 2i)} = \frac{1}{5}(3 + 4i)(1 + 2i) = \frac{1}{5}(-5 + 10i) = -1 + 2i$$

Alternative Solution

As in the remark above, multiply numerator and denominator by the conjugate of the denominator:

$$\frac{3 + 4i}{1 - 2i} = \frac{3 + 4i}{1 - 2i} \cdot \frac{1 + 2i}{1 + 2i} = \frac{-5 + 10i}{5} = -1 + 2i$$
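
Formula (5) translates directly into code. The sketch below is illustrative only: the function name `divide` is hypothetical, and Python's built-in `/` operator already performs complex division, so this exists purely to mirror the formula.

```python
def divide(z1: complex, z2: complex) -> complex:
    """Quotient z1/z2 computed as (1/|z2|^2) * z1 * conj(z2), as in Formula (5)."""
    if z2 == 0:
        raise ZeroDivisionError("division by zero is not allowed")
    mod_squared = z2.real ** 2 + z2.imag ** 2   # |z2|^2 = z2 * conj(z2)
    return (z1 * z2.conjugate()) / mod_squared

q = divide(3 + 4j, 1 - 2j)      # the quotient from Example 3
print(q)                        # (-1+2j)
```

Dividing by the real number $|z_2|^2$ is the point of the conjugate trick: the denominator is no longer complex.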



Systems of linear equations with complex coefficients arise in various applications. Without going into detail, we note that 
all the results about linear systems studied in Chapters 1 and 2 carry over without change to systems with complex 
coefficients. Note, however, that a few results studied in other chapters will change for complex matrices. 



EXAMPLE 4 A Linear System with Complex Coefficients 



Use Cramer's rule to solve 



ix + 2y = 1 — 2i 
4x - iy = - 1 + 3i 



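Solution

$$x = \frac{\begin{vmatrix} 1 - 2i & 2 \\ -1 + 3i & -i \end{vmatrix}}{\begin{vmatrix} i & 2 \\ 4 & -i \end{vmatrix}} = \frac{(-i)(1 - 2i) - 2(-1 + 3i)}{i(-i) - 2(4)} = \frac{-7i}{-7} = i$$

$$y = \frac{\begin{vmatrix} i & 1 - 2i \\ 4 & -1 + 3i \end{vmatrix}}{\begin{vmatrix} i & 2 \\ 4 & -i \end{vmatrix}} = \frac{(i)(-1 + 3i) - 4(1 - 2i)}{i(-i) - 2(4)} = \frac{-7 + 7i}{-7} = 1 - i$$

Thus the solution is $x = i$, $y = 1 - i$.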
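
The computation in Example 4 can be reproduced with a short sketch of Cramer's rule for a 2 x 2 system. The helper names `det2` and `cramer2` are hypothetical and chosen only for this illustration.

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

def cramer2(a11, a12, a21, a22, b1, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 by Cramer's rule."""
    d = det2(a11, a12, a21, a22)
    if d == 0:
        raise ValueError("coefficient matrix is singular")
    x = det2(b1, a12, b2, a22) / d   # replace the first column by the constants
    y = det2(a11, b1, a21, b2) / d   # replace the second column by the constants
    return x, y

# The system of Example 4: ix + 2y = 1 - 2i, 4x - iy = -1 + 3i
x, y = cramer2(1j, 2, 4, -1j, 1 - 2j, -1 + 3j)
print(x, y)   # approximately i and 1 - i
```

Nothing in the routine is specific to real numbers, which reflects the remark that the results of Chapters 1 and 2 carry over to complex coefficients.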



We conclude this section by listing some properties of the complex conjugate that will be useful in later sections. 



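THEOREM 10.2.3

Properties of the Conjugate

For any complex numbers $z$, $z_1$, and $z_2$:

(a) $\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2$

(b) $\overline{z_1 - z_2} = \bar{z}_1 - \bar{z}_2$

(c) $\overline{z_1 z_2} = \bar{z}_1 \bar{z}_2$

(d) $\overline{(z_1 / z_2)} = \bar{z}_1 / \bar{z}_2$

(e) $\bar{\bar{z}} = z$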



We prove (a) and leave the rest as exercises. 



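Proof (a) Let $z_1 = a_1 + b_1 i$ and $z_2 = a_2 + b_2 i$; then

$$\overline{z_1 + z_2} = \overline{(a_1 + a_2) + (b_1 + b_2)i}$$
$$= (a_1 + a_2) - (b_1 + b_2)i$$
$$= (a_1 - b_1 i) + (a_2 - b_2 i)$$
$$= \bar{z}_1 + \bar{z}_2$$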



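Remark It is possible to extend part (a) of Theorem 10.2.3 to $n$ terms and part (c) to $n$ factors. More precisely,

$$\overline{z_1 + z_2 + \cdots + z_n} = \bar{z}_1 + \bar{z}_2 + \cdots + \bar{z}_n$$

$$\overline{z_1 z_2 \cdots z_n} = \bar{z}_1 \bar{z}_2 \cdots \bar{z}_n$$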



Exercise Set 10.2 






1. In each part, find $\bar{z}$.

(a) $z = 2 + 7i$

(b) $z = -3 - 5i$

(c) $z = 5i$

(d) $z = -i$

(e) $z = -9$

(f) $z = 0$



2. In each part, find $|z|$.

(a) $z = i$

(b) $z = -7i$

(c) $z = -3 - 4i$

(d) $z = 1 + i$

(e) $z = -8$

(f) $z = 0$

3. Verify that $z\bar{z} = |z|^2$ for

(a) $z = 2 - 4i$

(b) $z = -3 + 5i$

(c) $z = \sqrt{2} - \sqrt{2}\,i$

4. Given that $z_1 = 1 - 5i$ and $z_2 = 3 + 4i$, find

(a) $z_1 / z_2$

(b) $\bar{z}_1 / z_2$

(c) $z_1 / \bar{z}_2$

(d) $\overline{(z_1 / z_2)}$

(e) $z_1 / |z_2|$

(f) $|z_1 / z_2|$



5. In each part, find $1/z$.



(a) z = i 

(b) z = 1 - S 



W z = ^ 



6. Given that $z_1 = 1 + i$ and $z_2 = 1 - 2i$, find



(a) zi 



"£) 



(b) Z1 -1 

Z2 



<C> ^-(t 



(d) z±_ 

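In Exercises 7-14 perform the calculations and express the result in the form $a + bi$.

7. $\dfrac{i}{1 + i}$

8. $\dfrac{2}{(1 - i)(3 + i)}$

9. $\dfrac{1}{(3 + 4i)^2}$

10. $\dfrac{2 + i}{i(-3 + 4i)}$

11. $\dfrac{\sqrt{3} + i}{(1 - i)(\sqrt{3} - i)}$

12. $\dfrac{1}{i(3 - 2i)(1 + i)}$

13. $\dfrac{i}{(1 - i)(1 - 2i)(1 + 2i)}$

14. $\dfrac{1 - 2i}{3 + 4i} - \dfrac{2 + i}{5i}$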

15. In each part, solve for $z$.

(a) iz = 2-i 

(b) (4-3i)z = i 

16. Use Theorem 10.2.3 to prove the following identities:

(a) $\overline{\bar{z} + 5i} = z - 5i$

(b) $\overline{iz} = -i\bar{z}$

(c) $\dfrac{\overline{i + \bar{z}}}{i - z} = -1$



17. In each part, sketch the set of points in the complex plane that satisfies the equation.

(a) $|z| = 2$

(b) $|z - (1 + i)| = 1$

(c) $|z - i| = |z + i|$

(d) $\mathrm{Im}(\bar{z} + i) = 3$

18. In each part, sketch the set of points in the complex plane that satisfies the given condition(s).

(a) $|z + i| < 1$

(b) $1 < |z| < 2$

(c) $|2z - 4i| < 1$

(d) $|z| < |z + i|$

19. Given that $z = x + iy$, find

(a) $\mathrm{Re}(\overline{iz})$

(b) $\mathrm{Im}(\overline{iz})$

(c) $\mathrm{Re}(i\bar{z})$

(d) $\mathrm{Im}(i\bar{z})$



20.

(a) Show that if $n$ is a positive integer, then the only possible values for $(1/i)^n$ are $1$, $-1$, $i$, and $-i$.

(b) Find $(1/i)^{2509}$.

Hint See Exercise 23(b) of Section 10.1.



21. Prove:

(a) $\frac{1}{2}(z + \bar{z}) = \mathrm{Re}(z)$

(b) $\frac{1}{2i}(z - \bar{z}) = \mathrm{Im}(z)$



22. Prove: $z = \bar{z}$ if and only if $z$ is a real number.



23. Given that $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2 \neq 0$, find



24. Prove: If $(\bar{z})^2 = z^2$, then $z$ is either real or pure imaginary.



25. Prove that $|\bar{z}| = |z|$.



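26. Prove:

(a) $\overline{z_1 - z_2} = \bar{z}_1 - \bar{z}_2$

(b) $\overline{z_1 z_2} = \bar{z}_1 \bar{z}_2$

(c) $\overline{(z_1 / z_2)} = \bar{z}_1 / \bar{z}_2$

(d) $\bar{\bar{z}} = z$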



27.

(a) Prove that $\overline{z^2} = (\bar{z})^2$.

(b) Prove that if $n$ is a positive integer, then $\overline{z^n} = (\bar{z})^n$.

(c) Is the result in part (b) true if $n$ is a negative integer? Explain.

In Exercises 28-31 solve the system of linear equations by Cramer's rule.

28. $ix_1 - ix_2 = -2$
    $2x_1 + x_2 = i$

29. $x_1 + x_2 = 2$
    $x_1 - x_2 = 2i$

30. $x_1 + x_2 + x_3 = 3$
    $x_1 + x_2 - x_3 = 2 + 2i$
    $x_1 - x_2 + x_3 = -1$

31. $ix_1 + 3x_2 + (1 + i)x_3 = -i$
    $x_1 + ix_2 + 3x_3 = -2i$
    $x_1 + x_2 + x_3 = 0$

In Exercises 32 and 33 solve the system of linear equations by Gauss-Jordan elimination. 



32. 



33. 



34. 



-1 

1+i 


-l-j 
-2 


/2_ 


= 






2 
1+i 


-1-f 
1 


*2 


= 


"0 




Solve the following system of linear equations by Gauss-Jordan elimination.

$$x_1 + ix_2 - ix_3 = 0$$
$$-x_1 + (1 - i)x_2 + 2ix_3 = 0$$
$$2x_1 + (-1 + 2i)x_2 - 3ix_3 = 0$$

35. In each part, use the formula in Theorem 1.4.5 to compute the inverse of the matrix, and check your result by showing that $AA^{-1} = A^{-1}A = I$.

(a) $A = \begin{bmatrix} i & -2 \\ 1 & i \end{bmatrix}$

(b) $A = \begin{bmatrix} 2 & i \\ 1 & 0 \end{bmatrix}$



36. Let $p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ be a polynomial for which the coefficients $a_0, a_1, a_2, \ldots, a_n$ are real. Prove that if $z$ is a solution of the equation $p(x) = 0$, then so is $\bar{z}$.



37. Prove: For any complex number $z$, $|\mathrm{Re}(z)| \le |z|$ and $|\mathrm{Im}(z)| \le |z|$.



38. Prove that

$$\frac{|\mathrm{Re}(z)| + |\mathrm{Im}(z)|}{\sqrt{2}} \le |z|$$

Hint Let $z = x + iy$ and use the fact that $(|x| - |y|)^2 \ge 0$.



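39. In each part, use the method of Example 4 in Section 1.5 to find $A^{-1}$, and check your result by showing that $AA^{-1} = A^{-1}A = I$.

(a) $A = \begin{bmatrix} 1 & 1 + i & 0 \\ 0 & 1 & i \\ -i & 1 - 2i & 2 \end{bmatrix}$

(b) $A = \begin{bmatrix} i & 0 & -i \\ 0 & 1 & -1 - 4i \\ 2 - i & i & 3 \end{bmatrix}$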



40. Show that $|z - 1| = |\bar{z} - 1|$. Discuss the geometric interpretation of the result.



41.

(a) If $z_1 = a_1 + b_1 i$ and $z_2 = a_2 + b_2 i$, find $|z_1 - z_2|$ and interpret the result geometrically.

(b) Use part (a) to show that the complex numbers $12$, $6 + 2i$, and $8 + 8i$ are vertices of a right triangle.



42. Use Theorem 10.2.3 to show that if the coefficients $a$, $b$, and $c$ in a quadratic polynomial are real, then the solutions of the equation $az^2 + bz + c = 0$ are complex conjugates. What can you conclude if $a$, $b$, and $c$ are complex?
10.3

POLAR FORM OF A 
COMPLEX NUMBER 



In this section we shall discuss a way to represent complex numbers using 
trigonometric properties. Our work will lead to an important formula for 
powers of complex numbers and to a method for finding nth roots of complex 
numbers. 



Polar Form 

If $z = x + iy$ is a nonzero complex number, $r = |z|$, and $\theta$ measures the angle from the positive real axis to the vector $z$, then, as suggested by Figure 10.3.1,

$$x = r\cos\theta, \qquad y = r\sin\theta \tag{1}$$

so that $z = x + iy$ can be written as $z = r\cos\theta + ir\sin\theta$ or

$$z = r(\cos\theta + i\sin\theta) \tag{2}$$

This is called a polar form of $z$.

Argument of a Complex Number

The angle $\theta$ is called an argument of $z$ and is denoted by

$$\theta = \arg z$$

The argument of $z$ is not uniquely determined because we can add or subtract any multiple of $2\pi$ from $\theta$ to produce another value of the argument. However, there is only one value of the argument in radians that satisfies

$$-\pi < \theta \le \pi$$

This is called the principal argument of $z$ and is denoted by

$$\theta = \mathrm{Arg}\, z$$

Figure 10.3.1



EXAMPLE 1 Polar Forms 



Express the following complex numbers in polar form using their principal arguments:

(a) $z = 1 + \sqrt{3}\,i$

(b) $z = -1 - i$

Solution (a)

The value of $r$ is

$$r = |z| = \sqrt{1^2 + (\sqrt{3})^2} = \sqrt{4} = 2$$

and since $x = 1$ and $y = \sqrt{3}$, it follows from (1) that

$$1 = 2\cos\theta \quad\text{and}\quad \sqrt{3} = 2\sin\theta$$

so $\cos\theta = 1/2$ and $\sin\theta = \sqrt{3}/2$. The only value of $\theta$ that satisfies these relations and meets the requirement $-\pi < \theta \le \pi$ is $\theta = \pi/3$ $(= 60°)$ (see Figure 10.3.2a). Thus a polar form of $z$ is

$$z = 2\left(\cos\frac{\pi}{3} + i\sin\frac{\pi}{3}\right)$$

Solution (b)

The value of $r$ is

$$r = |z| = \sqrt{(-1)^2 + (-1)^2} = \sqrt{2}$$

and since $x = -1$, $y = -1$, it follows from (1) that

$$-1 = \sqrt{2}\cos\theta \quad\text{and}\quad -1 = \sqrt{2}\sin\theta$$

so $\cos\theta = -1/\sqrt{2}$ and $\sin\theta = -1/\sqrt{2}$. The only value of $\theta$ that satisfies these relations and meets the requirement $-\pi < \theta \le \pi$ is $\theta = -3\pi/4$ $(= -135°)$ (Figure 10.3.2b). Thus, a polar form of $z$ is

$$z = \sqrt{2}\left[\cos\left(-\frac{3\pi}{4}\right) + i\sin\left(-\frac{3\pi}{4}\right)\right]$$



Figure 10.3.2
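
The polar forms in Example 1 can be checked with Python's standard `cmath` module. This is a minimal sketch; the helper name `polar_form` is hypothetical. It relies on `cmath.phase`, which returns an angle in the range $[-\pi, \pi]$ and so agrees with the principal argument for the inputs used here.

```python
import cmath
import math

def polar_form(z: complex):
    """Return (r, theta) with r = |z| and theta the principal argument of z."""
    if z == 0:
        raise ValueError("the argument of 0 is undefined")
    return abs(z), cmath.phase(z)

r_a, theta_a = polar_form(1 + math.sqrt(3) * 1j)   # Example 1(a)
r_b, theta_b = polar_form(-1 - 1j)                 # Example 1(b)

print(r_a, math.degrees(theta_a))   # approximately 2 and 60 degrees
print(r_b, math.degrees(theta_b))   # approximately sqrt(2) and -135 degrees
```

The values agree with the hand computation up to floating-point rounding.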



Multiplication and Division Interpreted Geometrically 

We now show how polar forms can be used to give geometric interpretations of multiplication and division of complex 
numbers. Let 

$$z_1 = r_1(\cos\theta_1 + i\sin\theta_1) \quad\text{and}\quad z_2 = r_2(\cos\theta_2 + i\sin\theta_2)$$

Multiplying, we obtain

$$z_1 z_2 = r_1 r_2\left[(\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2) + i(\sin\theta_1\cos\theta_2 + \cos\theta_1\sin\theta_2)\right]$$

Recalling the trigonometric identities

$$\cos(\theta_1 + \theta_2) = \cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2$$
$$\sin(\theta_1 + \theta_2) = \sin\theta_1\cos\theta_2 + \cos\theta_1\sin\theta_2$$

we obtain

$$z_1 z_2 = r_1 r_2\left[\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)\right] \tag{3}$$

which is a polar form of the complex number with modulus $r_1 r_2$ and argument $\theta_1 + \theta_2$. Thus we have shown that

$$|z_1 z_2| = |z_1||z_2| \tag{4}$$

and

$$\arg(z_1 z_2) = \arg z_1 + \arg z_2$$

(Why?) In words, the product of two complex numbers is obtained by multiplying their moduli and adding their arguments (Figure 10.3.3).



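We leave it as an exercise to show that if $z_2 \neq 0$, then

$$\frac{z_1}{z_2} = \frac{r_1}{r_2}\left[\cos(\theta_1 - \theta_2) + i\sin(\theta_1 - \theta_2)\right] \tag{5}$$

from which it follows that

$$\left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|} \quad\text{if } z_2 \neq 0$$

and

$$\arg\left(\frac{z_1}{z_2}\right) = \arg z_1 - \arg z_2$$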



In words, the quotient of two complex numbers is obtained by dividing their moduli and subtracting their arguments (in the 
appropriate order). 




Figure 10.3.3 



The product of two complex numbers. 



EXAMPLE 2 A Quotient Using Polar Forms 



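Let

$$z_1 = 1 + \sqrt{3}\,i \quad\text{and}\quad z_2 = \sqrt{3} + i$$

Polar forms of these complex numbers are

$$z_1 = 2\left(\cos\frac{\pi}{3} + i\sin\frac{\pi}{3}\right) \quad\text{and}\quad z_2 = 2\left(\cos\frac{\pi}{6} + i\sin\frac{\pi}{6}\right)$$

(verify) so that from (3),

$$z_1 z_2 = 4\left[\cos\left(\frac{\pi}{3} + \frac{\pi}{6}\right) + i\sin\left(\frac{\pi}{3} + \frac{\pi}{6}\right)\right] = 4\left[\cos\frac{\pi}{2} + i\sin\frac{\pi}{2}\right] = 4[0 + i] = 4i$$

and from (5),

$$\frac{z_1}{z_2} = 1 \cdot \left[\cos\left(\frac{\pi}{3} - \frac{\pi}{6}\right) + i\sin\left(\frac{\pi}{3} - \frac{\pi}{6}\right)\right] = \cos\frac{\pi}{6} + i\sin\frac{\pi}{6} = \frac{\sqrt{3}}{2} + \frac{1}{2}i$$

As a check, we calculate $z_1 z_2$ and $z_1 / z_2$ directly without using polar forms for $z_1$ and $z_2$:

$$z_1 z_2 = (1 + \sqrt{3}\,i)(\sqrt{3} + i) = (\sqrt{3} - \sqrt{3}) + (3 + 1)i = 4i$$

$$\frac{z_1}{z_2} = \frac{1 + \sqrt{3}\,i}{\sqrt{3} + i} \cdot \frac{\sqrt{3} - i}{\sqrt{3} - i} = \frac{(\sqrt{3} + \sqrt{3}) + (-i + 3i)}{4} = \frac{\sqrt{3}}{2} + \frac{1}{2}i$$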

which agrees with our previous results. 
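
The rule "multiply the moduli and add the arguments" (and its counterpart for quotients) can be sketched as follows. The function names `multiply_polar` and `divide_polar` are hypothetical; `cmath.rect(r, theta)` from the standard library converts a polar pair back to rectangular form.

```python
import cmath
import math

def multiply_polar(r1, theta1, r2, theta2):
    """Product in polar form: multiply moduli, add arguments (Formula 3)."""
    return r1 * r2, theta1 + theta2

def divide_polar(r1, theta1, r2, theta2):
    """Quotient in polar form: divide moduli, subtract arguments (Formula 5)."""
    if r2 == 0:
        raise ZeroDivisionError("z2 must be nonzero")
    return r1 / r2, theta1 - theta2

# Example 2: z1 = 2(cos pi/3 + i sin pi/3), z2 = 2(cos pi/6 + i sin pi/6)
prod = cmath.rect(*multiply_polar(2, math.pi / 3, 2, math.pi / 6))
quot = cmath.rect(*divide_polar(2, math.pi / 3, 2, math.pi / 6))

print(prod)   # approximately 4i
print(quot)   # approximately sqrt(3)/2 + (1/2)i
```

Note that the sum or difference of two principal arguments need not itself be a principal argument; it is merely one valid argument of the result.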



The complex number $i$ has a modulus of 1 and an argument of $\pi/2$ $(= 90°)$, so the product $iz$ has the same modulus as $z$, but its argument is 90° greater than that of $z$. In short, multiplying $z$ by $i$ rotates $z$ counterclockwise by 90° (Figure 10.3.4).




Figure 10.3.4 



Multiplying by i rotates z counterclockwise by 90°. 



DeMoivre's Formula 

If $n$ is a positive integer and $z = r(\cos\theta + i\sin\theta)$, then from Formula (3),

$$z^n = \underbrace{z \cdot z \cdots z}_{n\ \text{factors}} = r^n\Big[\cos(\underbrace{\theta + \theta + \cdots + \theta}_{n\ \text{terms}}) + i\sin(\underbrace{\theta + \theta + \cdots + \theta}_{n\ \text{terms}})\Big]$$

or

$$z^n = r^n(\cos n\theta + i\sin n\theta) \tag{6}$$

Moreover, (6) also holds for negative integers if $z \neq 0$ (see Exercise 23).

In the special case where $r = 1$, we have $z = \cos\theta + i\sin\theta$, so (6) becomes

$$(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta \tag{7}$$

which is called DeMoivre's formula. Although we derived (7) assuming $n$ to be a positive integer, it will be shown in the exercises that this formula is valid for all integers $n$.

Finding nth Roots



We now show how DeMoivre's formula can be used to obtain roots of complex numbers. If $n$ is a positive integer and $z$ is any complex number, then we define an nth root of $z$ to be any complex number $w$ that satisfies the equation

$$w^n = z \tag{8}$$

We denote an nth root of $z$ by $z^{1/n}$. If $z \neq 0$, then we can derive formulas for the nth roots of $z$ as follows. Let

$$w = \rho(\cos\alpha + i\sin\alpha) \quad\text{and}\quad z = r(\cos\theta + i\sin\theta)$$

If we assume that $w$ satisfies (8), then it follows from (6) that

$$\rho^n(\cos n\alpha + i\sin n\alpha) = r(\cos\theta + i\sin\theta) \tag{9}$$

Comparing the moduli of the two sides, we see that $\rho^n = r$ or

$$\rho = \sqrt[n]{r}$$

where $\sqrt[n]{r}$ denotes the real positive nth root of $r$. Moreover, in order to have the equalities $\cos n\alpha = \cos\theta$ and $\sin n\alpha = \sin\theta$ in (9), the angles $n\alpha$ and $\theta$ must either be equal or differ by a multiple of $2\pi$. That is,

$$n\alpha = \theta + 2k\pi \quad\text{or}\quad \alpha = \frac{\theta}{n} + \frac{2k\pi}{n}, \qquad k = 0, \pm 1, \pm 2, \ldots$$

Thus the values of $w = \rho(\cos\alpha + i\sin\alpha)$ that satisfy (8) are given by

$$w = \sqrt[n]{r}\left[\cos\left(\frac{\theta}{n} + \frac{2k\pi}{n}\right) + i\sin\left(\frac{\theta}{n} + \frac{2k\pi}{n}\right)\right], \qquad k = 0, \pm 1, \pm 2, \ldots$$

Although there are infinitely many values of $k$, it can be shown (see Exercise 16) that $k = 0, 1, 2, \ldots, n - 1$ produce distinct values of $w$ satisfying (8) but all other choices of $k$ yield duplicates of these. Therefore, there are exactly $n$ different nth roots of $z = r(\cos\theta + i\sin\theta)$, and these are given by

$$z^{1/n} = \sqrt[n]{r}\left[\cos\left(\frac{\theta}{n} + \frac{2k\pi}{n}\right) + i\sin\left(\frac{\theta}{n} + \frac{2k\pi}{n}\right)\right], \qquad k = 0, 1, 2, \ldots, n - 1 \tag{10}$$




Abraham DeMoivre (1667-1754) was a French mathematician who made important contributions to probability, statistics, 
and trigonometry. He developed the concept of statistically independent events, wrote a major and influential treatise on 
probability, and helped transform trigonometry from a branch of geometry into a branch of analysis through his use of 
complex numbers. In spite of his important work, he barely managed to eke out a living as a tutor and a consultant on 
gambling and insurance. 



EXAMPLE 3 Cube Roots of a Complex Number 



Find all cube roots of -8. 



Solution 



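Since $-8$ lies on the negative real axis, we can use $\theta = \pi$ as an argument. Moreover, $r = |z| = |-8| = 8$, so a polar form of $-8$ is

$$-8 = 8(\cos\pi + i\sin\pi)$$

From (10) with $n = 3$, it follows that

$$(-8)^{1/3} = \sqrt[3]{8}\left[\cos\left(\frac{\pi}{3} + \frac{2k\pi}{3}\right) + i\sin\left(\frac{\pi}{3} + \frac{2k\pi}{3}\right)\right], \qquad k = 0, 1, 2$$

Thus the cube roots of $-8$ are

$$2\left(\cos\frac{\pi}{3} + i\sin\frac{\pi}{3}\right) = 2\left(\frac{1}{2} + \frac{\sqrt{3}}{2}i\right) = 1 + \sqrt{3}\,i$$

$$2(\cos\pi + i\sin\pi) = 2(-1) = -2$$

$$2\left(\cos\frac{5\pi}{3} + i\sin\frac{5\pi}{3}\right) = 2\left(\frac{1}{2} - \frac{\sqrt{3}}{2}i\right) = 1 - \sqrt{3}\,i$$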



As shown in Figure 10.3.5, the three cube roots of $-8$ obtained in Example 3 are equally spaced $2\pi/3$ radians $(= 120°)$ apart around the circle of radius 2 centered at the origin. This is not accidental. In general, it follows from Formula (10) that the nth roots of $z$ lie on the circle of radius $\sqrt[n]{r}$ $(= \sqrt[n]{|z|})$ and are equally spaced $2\pi/n$ radians apart. (Can you see why?) Thus, once one nth root of $z$ is found, the remaining $n - 1$ roots can be generated by rotating this root successively through increments of $2\pi/n$ radians.
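
Formula (10) can be sketched as a short routine. The function name `nth_roots` is hypothetical; `cmath.polar` and `cmath.rect` are standard-library conversions between rectangular and polar form. The sketch assumes $z \neq 0$ and a positive integer $n$.

```python
import cmath
import math

def nth_roots(z: complex, n: int):
    """All n distinct nth roots of a nonzero z, following Formula (10)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    r, theta = cmath.polar(z)            # r = |z|, theta = an argument of z
    rho = r ** (1.0 / n)                 # real positive nth root of r
    return [cmath.rect(rho, (theta + 2 * math.pi * k) / n) for k in range(n)]

roots = nth_roots(-8, 3)                 # Example 3: cube roots of -8
for w in roots:
    print(w, w ** 3)                     # each w**3 is approximately -8
```

Up to rounding, the three values are $1 + \sqrt{3}\,i$, $-2$, and $1 - \sqrt{3}\,i$, each of modulus 2 and separated by an angle of $2\pi/3$.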




Figure 10.3.5 



The cube roots of -8. 



EXAMPLE 4 Fourth Roots of a Complex Number 



Find all fourth roots of 1. 



Solution 

We could apply Formula 10. Instead, we observe that w = 1 is one fourth root of 1, so the remaining three roots can be 



generated by rotating this root through increments of $2\pi/4 = \pi/2$ radians $(= 90°)$. From Figure 10.3.6, we see that the fourth roots of 1 are

$$1, \quad i, \quad -1, \quad -i$$






Figure 10.3.6 



The fourth roots of 1 . 



Complex Exponents 

We conclude this section with some comments on notation. 

In more detailed studies of complex numbers, complex exponents are defined, and it is shown that

$$\cos\theta + i\sin\theta = e^{i\theta} \tag{11}$$

where $e$ is an irrational real number given approximately by $e \approx 2.71828\ldots$. (For readers who have studied calculus, a proof of this result is given in Exercise 18.)

It follows from (11) that the polar form

$$z = r(\cos\theta + i\sin\theta)$$

can be written more briefly as

$$z = re^{i\theta} \tag{12}$$



EXAMPLE 5 Expressing a Complex Number in Form 12 



In Example 1 it was shown that

$$1 + \sqrt{3}\,i = 2\left(\cos\frac{\pi}{3} + i\sin\frac{\pi}{3}\right)$$

From (12) this can also be written as

$$1 + \sqrt{3}\,i = 2e^{i\pi/3}$$


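It can be proved that complex exponents follow the same laws as real exponents, so if

$$z_1 = r_1 e^{i\theta_1} \quad\text{and}\quad z_2 = r_2 e^{i\theta_2}$$

are nonzero complex numbers, then

$$z_1 z_2 = r_1 r_2 e^{i\theta_1 + i\theta_2} = r_1 r_2 e^{i(\theta_1 + \theta_2)}$$

$$\frac{z_1}{z_2} = \frac{r_1}{r_2} e^{i\theta_1 - i\theta_2} = \frac{r_1}{r_2} e^{i(\theta_1 - \theta_2)}$$

But these are just Formulas (3) and (5) in a different notation.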

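We conclude this section with a useful formula for $\bar{z}$ in polar notation. If

$$z = re^{i\theta} = r(\cos\theta + i\sin\theta)$$

then

$$\bar{z} = r(\cos\theta - i\sin\theta) \tag{13}$$

Recalling the trigonometric identities

$$\sin(-\theta) = -\sin\theta \quad\text{and}\quad \cos(-\theta) = \cos\theta$$

we can rewrite (13) as

$$\bar{z} = r[\cos(-\theta) + i\sin(-\theta)]$$

or, equivalently,

$$\bar{z} = re^{-i\theta} \tag{14}$$

In the special case where $r = 1$, the polar form of $z$ is $z = e^{i\theta}$, and (14) yields the formula

$$\overline{e^{i\theta}} = e^{-i\theta} \tag{15}$$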
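
Formulas (12) and (14) can be checked numerically. The sketch below uses `cmath.exp` from the standard library; the helper name `from_polar` is a hypothetical choice.

```python
import cmath
import math

def from_polar(r: float, theta: float) -> complex:
    """Build z = r * e^(i*theta), as in Formula (12)."""
    return r * cmath.exp(1j * theta)

theta = math.pi / 3
z = from_polar(2, theta)                 # approximately 1 + sqrt(3) i (Example 5)

conj_direct = z.conjugate()              # conjugate taken directly
conj_polar = from_polar(2, -theta)       # conjugate via Formula (14)

print(z)
print(conj_direct, conj_polar)           # the two values agree up to rounding
```

Negating the angle reflects the point about the real axis, which is the geometric description of the conjugate given in Section 10.2.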



Exercise Set 10.3 






1. In each part, find the principal argument of $z$.

(a) $z = 1$

(b) $z = i$

(c) $z = -i$

(d) $z = 1 + i$

(e) $z = -1 + \sqrt{3}\,i$

(f) $z = 1 - i$



2. In each part, find the value of $\theta = \arg(1 - \sqrt{3}\,i)$ that satisfies the given condition.

(a) $0 < \theta \le 2\pi$

(b) $-\pi < \theta \le \pi$

(c) $-\dfrac{\pi}{6} \le \theta < \dfrac{11\pi}{6}$



3. In each part, express the complex number in polar form using its principal argument.

(a) $2i$

(b) $-4$

(c) $5 + 5i$

(d) $-6 + 6\sqrt{3}\,i$

(e) $-3 - 3i$

(f) $2\sqrt{3} - 2i$

4. Given that $z_1 = 2(\cos\pi/4 + i\sin\pi/4)$ and $z_2 = 3(\cos\pi/6 + i\sin\pi/6)$, find a polar form of

(a) $z_1 z_2$

(b) $\dfrac{z_1}{z_2}$

(c) $\dfrac{z_2}{z_1}$

(d) $\dfrac{z_1^5}{z_2^2}$



5. Express $z_1 = i$, $z_2 = 1 - \sqrt{3}\,i$, and $z_3 = \sqrt{3} + i$ in polar form, and use your results to find $z_1 z_2 / z_3$. Check your results by performing the calculations without using polar forms.



6. Use Formula (6) to find

(a) $(1 + i)^{12}$

(b) u-^r 

(c) (^ i o 7 



(d) $(1 - i\sqrt{3})^{-10}$



7. In each part, find all the roots and sketch them as vectors in the complex plane.

(a) $(-i)^{1/2}$

(b) $(1 + \sqrt{3}\,i)^{1/2}$

(c) $(-27)^{1/3}$

(d) $(i)^{1/3}$

(e) $(-1)^{1/4}$

(f) $(-8 + 8\sqrt{3}\,i)^{1/4}$

8. Use the method of Example 4 to find all cube roots of 1.

9. Use the method of Example 4 to find all sixth roots of 1.

10. Find all square roots of $1 + i$ and express your results in polar form.



11. Find all solutions of the equation $z^4 - 16 = 0$.

12. Find all solutions of the equation $z^4 + 8 = 0$ and use your results to factor $z^4 + 8$ into two quadratic factors with real coefficients.

13. It was shown in the text that multiplying $z$ by $i$ rotates $z$ counterclockwise by 90°. What is the geometric effect of dividing $z$ by $i$?

14. In each part, use (6) to calculate the given power.

(a) (1 | if 

<& (-2^3 I 20 ~ 9 



15. In each part, find $\mathrm{Re}(z)$ and $\mathrm{Im}(z)$.



(a) z = 2e™ 

(b) z = 3e~ iw 

(c) ?= ^/2 

(d) z=_ 3e -2™ 



16.

(a) Show that the values of $z^{1/n}$ in Formula (10) are all different.

(b) Show that integer values of $k$ other than $k = 0, 1, 2, \ldots, n - 1$ produce values of $z^{1/n}$ that are duplicates of those in Formula (10).

17. Show that Formula (7) is valid if $n = 0$ or $n$ is a negative integer.



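18. (For Readers Who Have Studied Calculus) To prove Formula (11), recall that the Maclaurin series for $e^x$ is

$$e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \cdots$$

(a) By substituting $x = i\theta$ in this series and simplifying, show that

$$e^{i\theta} = \left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots\right) + i\left(\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots\right)$$

(b) Use the result in part (a) to obtain Formula (11).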



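19. Derive Formula (5).

20. When $n = 2$ and $n = 3$, Equation (7) gives

$$(\cos\theta + i\sin\theta)^2 = \cos 2\theta + i\sin 2\theta$$
$$(\cos\theta + i\sin\theta)^3 = \cos 3\theta + i\sin 3\theta$$

Use these two equations to obtain trigonometric identities for $\cos 2\theta$, $\sin 2\theta$, $\cos 3\theta$, and $\sin 3\theta$.

21. Use Formula (11) to show that

$$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2} \quad\text{and}\quad \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}$$

22. Show that if $(a + bi)^3 = 8$, then $a^2 + b^2 = 4$.

23. Show that Formula (6) is valid for negative integer exponents if $z \neq 0$.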
10.4

COMPLEX VECTOR 
SPACES 



In this section we shall develop the basic properties of vector spaces with
complex scalars and discuss some of the ways in which they differ from real
vector spaces. However, before going farther, the reader should review the
vector space axioms given in Section 5.1.



Basic Properties 

Recall that a vector space in which the scalars are allowed to be complex numbers is called a complex vector space. Linear combinations of vectors in a complex vector space are defined exactly as in a real vector space except that the scalars are allowed to be complex numbers. More precisely, a vector $\mathbf{w}$ is called a linear combination of the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ if $\mathbf{w}$ can be expressed in the form

$$\mathbf{w} = k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_r\mathbf{v}_r$$

where $k_1, k_2, \ldots, k_r$ are complex numbers.

The notions of linear independence, spanning, basis, dimension, and subspace carry over without change to complex vector spaces, and the theorems developed in Chapter 5 continue to hold with $R^n$ changed to $C^n$.

Among the real vector spaces the most important one is $R^n$, the space of n-tuples of real numbers, with addition and scalar multiplication performed coordinatewise. Among the complex vector spaces the most important one is $C^n$, the space of n-tuples of complex numbers, with addition and scalar multiplication performed coordinatewise. A vector $\mathbf{u}$ in $C^n$ can be written either in vector notation,

$$\mathbf{u} = (u_1, u_2, \ldots, u_n)$$

or in matrix notation,

$$\mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}$$

where

$$u_1 = a_1 + b_1 i, \quad u_2 = a_2 + b_2 i, \quad \ldots, \quad u_n = a_n + b_n i$$



EXAMPLE 1 Vector Addition and Scalar Multiplication 



If

$$\mathbf{u} = (i, 1 + i, -2) \quad\text{and}\quad \mathbf{v} = (2 + i, 1 - i, 3 + 2i)$$

then

$$\mathbf{u} + \mathbf{v} = (i, 1 + i, -2) + (2 + i, 1 - i, 3 + 2i) = (2 + 2i, 2, 1 + 2i)$$

and

$$i\mathbf{u} = i(i, 1 + i, -2) = (i^2, i + i^2, -2i) = (-1, -1 + i, -2i)$$



In $C^n$ as in $R^n$, the vectors

$$\mathbf{e}_1 = (1, 0, 0, \ldots, 0), \quad \mathbf{e}_2 = (0, 1, 0, \ldots, 0), \quad \ldots, \quad \mathbf{e}_n = (0, 0, 0, \ldots, 1)$$

form a basis. It is called the standard basis for $C^n$. Since there are $n$ vectors in this basis, $C^n$ is an n-dimensional vector space.

Remark Do not confuse the complex number $i = \sqrt{-1}$ with the vector $\mathbf{i} = (1, 0, 0)$ from the standard basis for $R^3$ (see Example 3, Section 3.4). The complex number $i$ will always be set in lightface type and the vector $\mathbf{i}$ in boldface.



EXAMPLE 2 Complex $M_{mn}$

In Example 3 of Section 5.1 we defined the vector space $M_{mn}$ of $m \times n$ matrices with real entries. The complex analog of this space is the vector space of $m \times n$ matrices with complex entries and the operations of matrix addition and scalar multiplication. We refer to this space as complex $M_{mn}$.



EXAMPLE 3 Complex-Valued Function of a Real Variable 



If $f_1(x)$ and $f_2(x)$ are real-valued functions of the real variable $x$, then the expression

$$f(x) = f_1(x) + if_2(x)$$

is called a complex-valued function of the real variable $x$. Two examples are

$$f(x) = 2x + ix^3 \quad\text{and}\quad g(x) = 2\sin x + i\cos x \tag{1}$$

Let $V$ be the set of all complex-valued functions that are defined on the entire line. If $\mathbf{f} = f_1(x) + if_2(x)$ and $\mathbf{g} = g_1(x) + ig_2(x)$ are two such functions and $k$ is any complex number, then we define the sum function $\mathbf{f} + \mathbf{g}$ and the scalar multiple $k\mathbf{f}$ by

$$(\mathbf{f} + \mathbf{g})(x) = [f_1(x) + g_1(x)] + i[f_2(x) + g_2(x)]$$
$$(k\mathbf{f})(x) = kf_1(x) + ikf_2(x)$$

For example, if $\mathbf{f} = f(x)$ and $\mathbf{g} = g(x)$ are the functions in (1), then

$$(\mathbf{f} + \mathbf{g})(x) = (2x + 2\sin x) + i(x^3 + \cos x)$$
$$(i\mathbf{f})(x) = 2xi + i^2 x^3 = -x^3 + 2xi$$
$$((1 + i)\mathbf{g})(x) = (1 + i)(2\sin x + i\cos x) = (2\sin x - \cos x) + i(2\sin x + \cos x)$$

It can be shown that $V$ together with the stated operations is a complex vector space. It is the complex analog of the vector space $F(-\infty, \infty)$ of real-valued functions discussed in Example 4 of Section 5.1.



EXAMPLE 4 Complex $C(-\infty, \infty)$

Calculus Required

If $f(x) = f_1(x) + if_2(x)$ is a complex-valued function of the real variable $x$, then $f$ is said to be continuous if $f_1(x)$ and $f_2(x)$ are continuous. We leave it as an exercise to show that the set of all continuous complex-valued functions of a real variable $x$ is a subspace of the vector space of all complex-valued functions of $x$. This space is the complex analog of the vector space $C(-\infty, \infty)$ discussed in Example 6 of Section 5.2 and is called complex $C(-\infty, \infty)$. A closely related example is complex $C[a, b]$, the vector space of all complex-valued functions that are continuous on the closed interval $[a, b]$.

Recall that in $R^n$ the Euclidean inner product of two vectors

$$\mathbf{u} = (u_1, u_2, \ldots, u_n) \quad\text{and}\quad \mathbf{v} = (v_1, v_2, \ldots, v_n)$$

was defined as

$$\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n \tag{2}$$

and the Euclidean norm (or length) of $\mathbf{u}$ as

$$\|\mathbf{u}\| = (\mathbf{u} \cdot \mathbf{u})^{1/2} = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2} \tag{3}$$

Unfortunately, these definitions are not appropriate for vectors in $C^n$. For example, if (3) were applied to the vector $\mathbf{u} = (i, 1)$ in $C^2$, we would obtain

$$\|\mathbf{u}\| = \sqrt{i^2 + 1} = \sqrt{0} = 0$$

so $\mathbf{u}$ would be a nonzero vector with zero length, a situation that is clearly unsatisfactory.

To extend the notions of norm, distance, and angle to $C^n$ properly, we must modify the inner product slightly.



DEFINITION 



If $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ are vectors in $C^n$, then their complex Euclidean inner product $\mathbf{u} \cdot \mathbf{v}$ is defined by

$$\mathbf{u} \cdot \mathbf{v} = u_1\bar{v}_1 + u_2\bar{v}_2 + \cdots + u_n\bar{v}_n$$

where $\bar{v}_1, \bar{v}_2, \ldots, \bar{v}_n$ are the conjugates of $v_1, v_2, \ldots, v_n$.



Remark Observe that the Euclidean inner product of vectors in $C^n$ is a complex number, whereas the Euclidean inner product of vectors in $R^n$ is a real number.



EXAMPLE 5 Complex Inner Product 



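The complex Euclidean inner product of the vectors

$$\mathbf{u} = (-i, 2, 1 + 3i) \quad\text{and}\quad \mathbf{v} = (1 - i, 0, 1 + 3i)$$

is

$$\mathbf{u} \cdot \mathbf{v} = (-i)\overline{(1 - i)} + (2)\overline{(0)} + (1 + 3i)\overline{(1 + 3i)}$$
$$= (-i)(1 + i) + (2)(0) + (1 + 3i)(1 - 3i)$$
$$= -i - i^2 + 1 - 9i^2 = 11 - i$$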
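
The definition can be sketched in a few lines. The function name `inner` is hypothetical, and vectors are represented as plain Python tuples of complex numbers, which is an assumption made only for this illustration.

```python
def inner(u, v):
    """Complex Euclidean inner product: sum of u_k times the conjugate of v_k."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

u = (-1j, 2, 1 + 3j)
v = (1 - 1j, 0, 1 + 3j)
print(inner(u, v))              # (11-1j), as in Example 5

# Conjugating the second factor is what repairs the norm of (i, 1):
w = (1j, 1)
print(sum(a * a for a in w))    # (0+0j): the real formula gives zero length
print(inner(w, w))              # (2+0j): the complex formula gives |i|^2 + |1|^2
```

The last two lines illustrate why formula (3) for $R^n$ cannot be reused unchanged in $C^n$.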



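Theorem 4.1.2 listed the four main properties of the Euclidean inner product on $R^n$. The following theorem is the corresponding result for the complex Euclidean inner product on $C^n$.

THEOREM 10.4.1

Properties of the Complex Inner Product

If $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are vectors in $C^n$ and $k$ is any complex number, then

(a) $\mathbf{u} \cdot \mathbf{v} = \overline{\mathbf{v} \cdot \mathbf{u}}$

(b) $(\mathbf{u} + \mathbf{v}) \cdot \mathbf{w} = \mathbf{u} \cdot \mathbf{w} + \mathbf{v} \cdot \mathbf{w}$

(c) $(k\mathbf{u}) \cdot \mathbf{v} = k(\mathbf{u} \cdot \mathbf{v})$

(d) $\mathbf{v} \cdot \mathbf{v} \ge 0$. Further, $\mathbf{v} \cdot \mathbf{v} = 0$ if and only if $\mathbf{v} = \mathbf{0}$.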



Note the difference between part (a) of this theorem and part (a) of Theorem 4.1.2. We will prove parts (a) and (d) and leave 
the rest as exercises. 



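Proof (a) Let $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$. Then

$$\mathbf{u} \cdot \mathbf{v} = u_1\bar{v}_1 + u_2\bar{v}_2 + \cdots + u_n\bar{v}_n$$

and

$$\mathbf{v} \cdot \mathbf{u} = v_1\bar{u}_1 + v_2\bar{u}_2 + \cdots + v_n\bar{u}_n$$

so

$$\overline{\mathbf{v} \cdot \mathbf{u}} = \overline{v_1\bar{u}_1 + v_2\bar{u}_2 + \cdots + v_n\bar{u}_n}$$
$$= \bar{v}_1\bar{\bar{u}}_1 + \bar{v}_2\bar{\bar{u}}_2 + \cdots + \bar{v}_n\bar{\bar{u}}_n \quad\text{[Theorem 10.2.3, parts (a) and (c)]}$$
$$= \bar{v}_1 u_1 + \bar{v}_2 u_2 + \cdots + \bar{v}_n u_n \quad\text{[Theorem 10.2.3, part (e)]}$$
$$= u_1\bar{v}_1 + u_2\bar{v}_2 + \cdots + u_n\bar{v}_n$$
$$= \mathbf{u} \cdot \mathbf{v}$$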



Proof (d)

$$\mathbf{v} \cdot \mathbf{v} = v_1\bar{v}_1 + v_2\bar{v}_2 + \cdots + v_n\bar{v}_n = |v_1|^2 + |v_2|^2 + \cdots + |v_n|^2 \ge 0$$

Moreover, equality holds if and only if $|v_1| = |v_2| = \cdots = |v_n| = 0$. But this is true if and only if $v_1 = v_2 = \cdots = v_n = 0$; that is, it is true if and only if $\mathbf{v} = \mathbf{0}$.



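Remark We leave it as an exercise to prove that

$$\mathbf{u} \cdot (k\mathbf{v}) = \bar{k}(\mathbf{u} \cdot \mathbf{v})$$

for vectors in $C^n$. Compare this to the corresponding formula

$$\mathbf{u} \cdot (k\mathbf{v}) = k(\mathbf{u} \cdot \mathbf{v})$$

for vectors in $R^n$.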

Norm and Distance in $C^n$

By analogy with (3), we define the Euclidean norm (or Euclidean length) of a vector $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ in $C^n$ by

$$\|\mathbf{u}\| = (\mathbf{u} \cdot \mathbf{u})^{1/2} = \sqrt{|u_1|^2 + |u_2|^2 + \cdots + |u_n|^2}$$

and we define the Euclidean distance between the points $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ by

$$d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| = \sqrt{|u_1 - v_1|^2 + |u_2 - v_2|^2 + \cdots + |u_n - v_n|^2}$$



EXAMPLE 6 Norm and Distance 

If $\mathbf{u} = (i, 1 + i, 3)$ and $\mathbf{v} = (1 - i, 2, 4i)$, then

$$\|\mathbf{u}\| = \sqrt{|i|^2 + |1 + i|^2 + |3|^2} = \sqrt{1 + 2 + 9} = \sqrt{12} = 2\sqrt{3}$$

and

$$d(\mathbf{u}, \mathbf{v}) = \sqrt{|i - (1 - i)|^2 + |(1 + i) - 2|^2 + |3 - 4i|^2}$$
$$= \sqrt{|-1 + 2i|^2 + |-1 + i|^2 + |3 - 4i|^2}$$
$$= \sqrt{5 + 2 + 25} = \sqrt{32} = 4\sqrt{2}$$

The complex vector space $C^n$ with norm and inner product defined above is called complex Euclidean n-space.
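
The norm and distance computations of Example 6 can be reproduced with a brief sketch. The function names `norm` and `distance` are hypothetical, and tuples of complex numbers stand in for vectors in $C^n$.

```python
import math

def norm(u):
    """Euclidean norm in C^n: square root of the sum of squared moduli."""
    return math.sqrt(sum(abs(a) ** 2 for a in u))

def distance(u, v):
    """Euclidean distance in C^n: the norm of the difference u - v."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return norm([a - b for a, b in zip(u, v)])

u = (1j, 1 + 1j, 3)
v = (1 - 1j, 2, 4j)
print(norm(u))          # approximately 2*sqrt(3)
print(distance(u, v))   # approximately 4*sqrt(2)
```

Using `abs(a) ** 2` rather than `a * a` is the computational counterpart of conjugating the second factor in the inner product.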

Exercise Set 10.4 






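1. Let $\mathbf{u} = (2i, 0, -1, 3)$, $\mathbf{v} = (-i, i, 1 + i, -1)$, and $\mathbf{w} = (1 + i, -i, -1 + 2i, 0)$. Find

(a) $\mathbf{u} - \mathbf{v}$

(b) $i\mathbf{v} + 2\mathbf{w}$

(c) $-\mathbf{w} + \mathbf{v}$

(d) $3(\mathbf{u} - (1 + i)\mathbf{v})$

(e) $-i\mathbf{v} + 2i\mathbf{w}$

(f) $2\mathbf{v} - (\mathbf{u} + \mathbf{w})$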

2. Let $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ be the vectors in Exercise 1. Find the vector $\mathbf{x}$ that satisfies $\mathbf{u} - \mathbf{v} + i\mathbf{x} = 2i\mathbf{x} + \mathbf{w}$.

3. Let $\mathbf{u}_1 = (1 - i, i, 0)$, $\mathbf{u}_2 = (2i, 1 + i, 1)$, and $\mathbf{u}_3 = (0, 2i, 2 - i)$. Find scalars $c_1$, $c_2$, and $c_3$ such that $c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + c_3\mathbf{u}_3 = (-3 + i, 3 + 2i, 3 - 4i)$.

4. Show that there do not exist scalars $c_1$, $c_2$, and $c_3$ such that $c_1(i, 2 - i, 2 + i) + c_2(1 + i, -2i, 2) + c_3(3, i, 6 + i) = (i, i, i)$.

5. Find the Euclidean norm of $\mathbf{v}$ if

(a) v=(U) 

(b) v=(l+i,3U) 

(c) v= (2,0,2 hi, -1) 

(d) v=(-i,i,i,3,3 I 40 

6. Let $\mathbf{u} = (3i, 0, -i)$, $\mathbf{v} = (0, 3 + 4i, -2i)$, and $\mathbf{w} = (1 + i, 2i, 0)$. Find



(a) l|n + v|| 

(b) IMI I INI 

(c) ||-iu||+i||u|| 

(d) ||3u-5v + w| 



(e) _J_ W 

IKvll 



(f) IlirVwl 

it 



7. Show that if $\mathbf{v}$ is a nonzero vector in $C^n$, then $(1/\|\mathbf{v}\|)\mathbf{v}$ has Euclidean norm 1.

8. Find all scalars $k$ such that $\|k\mathbf{v}\| = 1$, where $\mathbf{v} = (3i, 4i)$.

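9. Find the Euclidean inner product $\mathbf{u} \cdot \mathbf{v}$ if

(a) $\mathbf{u} = (-i, 3i)$, $\mathbf{v} = (3i, 2i)$

(b) $\mathbf{u} = (3 - 4i, 2 + i, -6i)$, $\mathbf{v} = (1 + i, 2 - i, 4)$

(c) $\mathbf{u} = (1 - i, 1 + i, 2i, 3)$, $\mathbf{v} = (4 + 6i, -5i, -1 + i, i)$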

In Exercises 10 and 11 a set of objects is given, together with operations of addition and scalar multiplication. Determine which sets are complex vector spaces under the given operations. For those that are not, list all axioms that fail to hold.

The set of all triples of complex numbers ( Zl r Z2 , £3) with the operations 

and 

£(zi,Z2,z 3 ) = (tei,fe2.^3) 

The set of all complex 2x2 matrices of the form 
11- r n' 

z 
z 

with the standard matrix operations of addition and scalar multiplication. 

12. Use Theorem 5.2.1 to determine which of the following sets are subspaces of $C^3$:

(a) all vectors of the form $(z, 0, 0)$

(b) all vectors of the form $(z, i, i)$

(c) all vectors of the form $(z_1, z_2, z_3)$, where $z_3 = z_1 + z_2$

(d) all vectors of the form $(z_1, z_2, z_3)$, where $z_3 = z_1 + z_2 + i$

13. Let $T: C^3 \to C^3$ be a linear operator defined by $T(\mathbf{x}) = A\mathbf{x}$, where

$$A = \begin{bmatrix} i & -i & -1 \\ 1 & -i & 1+i \\ 0 & 1-i & 1 \end{bmatrix}$$

Find the kernel and nullity of $T$.

14. Use Theorem 5.2.1 to determine which of the following are subspaces of complex $M_{22}$:

(a) All complex matrices of the form

$$\begin{bmatrix} z_1 & z_2 \\ z_3 & z_4 \end{bmatrix}$$

where $z_1$ and $z_2$ are real.

(b) All complex matrices of the form

$$\begin{bmatrix} z_1 & z_2 \\ z_3 & z_4 \end{bmatrix}$$

where $z_1 + z_4 = 0$.

(c) All $2 \times 2$ complex matrices $A$ such that $(\bar{A})^T = A$, where $\bar{A}$ is the matrix whose entries are the conjugates of the corresponding entries in $A$.



Use Theorem 5.2.1 to determine which of the following are subspaces of the vector space of complex- valued functions 
15. of the real variable x: 



(a) all / such that / ( 1 ) = 



(b) all /such that f (Q) = i 



(c) aU/such that /(-*)=/ 00 



(d) all /of the form ^ | £ 2 e 3 *' w h ere k\ an d £2 are com pl ex numbers 



16. 



Which of the following are linear combinations of u = (j 9 — i,3i) and v = (2i, 4i, 0) 



(a) (3i,3i,3i) 



(b) (4i,2i,60 

(c) (i,S,S) 



(d) (0,0,0) 



Express the following as linear combinations of n — (1 ; 0, — i), v = (1 h i, 1, 1 — 2i), and w = (0, i, 2)- 
17. 



(a) (1, 1, 1) 

(b) (i,0, -0 

(c) (0,0,0) 

(d) (2-i, 1, 1+0 

In each part, determine whether the given vectors span c 3 . 
18. 

(a) vi = (i, i, i)> v 2 = (2J, 2i, 0), v 3 = (3i, 0, 0) 

(b) vi = (l + i.2-i P 3 I j)'V 2 = (24 % 0,1-0 

(c) vi = (1,0, -0.v 2 = Cl + i, 1, l-2i),v3 = C0,i, 2) 

(d) Vl = (U,0),v 2 = (0, -i, l),v 3 = (l,0, 1) 

Determine which of the following lie in the space spanned by 
l9 ' f = e ix and g = e~ ix 

(a) cost: 

(b) sin* 

(c) cos* I 3isinx 

Explain why the following are linearly dependent sets of vectors. (Solve this problem by inspection.) 
20. 

(a) Ul = (l-j, and „ 2 = (1 | j, - 1) in c 2 



(b) Ul = (l, -0.u 2 = (2 I i, -l),u 3 = (4,0)inc? 



& A = 



i 1i 


and 5 = 


"1 3" 


_2i 0_ 




_2 0_ 



in complex jy 2 2 



21. 



Which of the following sets of vectors in c 3 are linearly independent? 



(a) m = (1 - i, 1, 0), u 2 = (2, 1 + i, 0), u 3 = (1 + i, i, 0) 



(b) i M = (l, 0, - 0, u 2 = (1 + U, 1 - 2i). 113 = (0, U 2) 



(c) i M = (i, 0, 2 - 0* U2 = (0,1, 0, u 3 = ( - i, - 1 - **> 3 ) 



Given the vectors v = [a, b + ci] and w = [b — ci, a], for what real values of a, b, and c are the vectors v and w linearly 

22. dependent? 

Let V be the vector space of all complex-valued functions of the real variable x. Show that the following vectors are 

23. linearly dependent. 

2 2 2 2 

f = 3-\ 3i cos 2*, g = sin x I i cos x 9 h=cos x — z sin x 

Explain why the following sets of vectors are not bases for the indicated vector spaces. (Solve this problem by 

24. inspection.) 



(a) iii = (}, 20' U2 = (0, 30^ 113 = (1, 70 for C 2 



(b) Ul = (_i | j,0,2-0»ii2 = Cl- -U I for c 3 



25. 



Which of the following sets of vectors are bases for 



(a) (2i, -i), (4i,0) 

(b) (1 | U), (l+j,0 

(c) (0,0), (l hi, 1-i) 



26. 



(d) (2-3i,i), (3 I 2i, - 1) 
Which of the following sets of vectors are bases for c* 3 ? 

(a) (i.0,0), (i.i.O), (i.i.O 

(b) (1,0, -i), (1 I i, 1, 1-2;)' (0,i,2) 

(c) (i, 0,2-0.(0, 1,0, C-i, -1-4;, 3) 

(d) (1,0,0' (2-J, 1,2 I 0'(0.3i,30 

In Exercises 27-30, determine the dimension of and a basis for the solution space of the system. 

*1 + (1+0*2 = 
21 ' (1-0*1+ 2*2 = 

2*i -(1 I 0*2 = 

(-11 0*1 + *2 = 

x\ + (2 — 0*2 = 

*2 + iiX2 = U 

z*l I- (2 4- 2z)*2 + 3^3 = 

*i 4- ix7 — 2ix2+ *4 = 
30. 

ix\ I 3*2 I 4*3 — 22*4 = 

31. Prove: If $\mathbf{u}$ and $\mathbf{v}$ are vectors in complex Euclidean n-space, then

$$\mathbf{u} \cdot (k\mathbf{v}) = \bar{k}(\mathbf{u} \cdot \mathbf{v})$$

32. 

(a) Prove part (b) of Theorem 10.4. 1 . 

(b) Prove part (c) of Theorem 10.4.1. 
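
33. Establish the identity

$$\mathbf{u} \cdot \mathbf{v} = \tfrac{1}{4}\|\mathbf{u} + \mathbf{v}\|^2 - \tfrac{1}{4}\|\mathbf{u} - \mathbf{v}\|^2 + \tfrac{i}{4}\|\mathbf{u} + i\mathbf{v}\|^2 - \tfrac{i}{4}\|\mathbf{u} - i\mathbf{v}\|^2$$

for vectors in complex Euclidean n-space.
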
34. (For Readers Who Have Studied Calculus) Prove that complex $C(-\infty, \infty)$ is a subspace of the vector space of complex-valued functions of a real variable.
10.5

COMPLEX INNER 
PRODUCT SPACES

In Section 6.1 we defined the notion of an inner product on a real vector 
space by using the basic properties of the Euclidean inner product on $R^n$ as 
axioms. In this section we shall define inner products on complex vector 
spaces by using the properties of the Euclidean inner product on $C^n$ as 
axioms.



Complex Inner Product Spaces 

Motivated by Theorem 10.4.1, we make the following definition. 



DEFINITION 




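An inner product on a complex vector space $V$ is a function that associates a complex number $\langle \mathbf{u}, \mathbf{v} \rangle$ with each pair of vectors $\mathbf{u}$ and $\mathbf{v}$ in $V$ in such a way that the following axioms are satisfied for all vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ in $V$ and all scalars $k$.

(a) $\langle \mathbf{u}, \mathbf{v} \rangle = \overline{\langle \mathbf{v}, \mathbf{u} \rangle}$

(b) $\langle \mathbf{u} + \mathbf{v}, \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{w} \rangle + \langle \mathbf{v}, \mathbf{w} \rangle$

(c) $\langle k\mathbf{u}, \mathbf{v} \rangle = k\langle \mathbf{u}, \mathbf{v} \rangle$

(d) $\langle \mathbf{v}, \mathbf{v} \rangle \ge 0$, and $\langle \mathbf{v}, \mathbf{v} \rangle = 0$ if and only if $\mathbf{v} = \mathbf{0}$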



A complex vector space with an inner product is called a complex inner product space. 
The following additional properties follow immediately from the four inner product axioms: 

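(i) $\langle \mathbf{0}, \mathbf{v} \rangle = \langle \mathbf{v}, \mathbf{0} \rangle = 0$

(ii) $\langle \mathbf{u}, \mathbf{v} + \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{v} \rangle + \langle \mathbf{u}, \mathbf{w} \rangle$

(iii) $\langle \mathbf{u}, k\mathbf{v} \rangle = \bar{k}\langle \mathbf{u}, \mathbf{v} \rangle$

Since only (iii) differs from the corresponding results for real inner products, we will prove it and leave the other proofs as 
exercises.

$$\begin{aligned}
\langle \mathbf{u}, k\mathbf{v} \rangle &= \overline{\langle k\mathbf{v}, \mathbf{u} \rangle} && \text{[Axiom (a)]} \\
&= \overline{k\langle \mathbf{v}, \mathbf{u} \rangle} && \text{[Axiom (c)]} \\
&= \bar{k}\,\overline{\langle \mathbf{v}, \mathbf{u} \rangle} && \text{[Property of conjugates]} \\
&= \bar{k}\langle \mathbf{u}, \mathbf{v} \rangle && \text{[Axiom (a)]}
\end{aligned}$$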



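EXAMPLE 1 Inner Product on $C^n$

Let $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ be vectors in $C^n$. The Euclidean inner product 
$\langle \mathbf{u}, \mathbf{v} \rangle = \mathbf{u} \cdot \mathbf{v} = u_1\bar{v}_1 + u_2\bar{v}_2 + \cdots + u_n\bar{v}_n$ satisfies all the inner product axioms by Theorem 10.4.1.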
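
The conjugate in property (iii) is easy to overlook, so a quick numerical check may help. The following is a minimal sketch in plain Python; the function name `inner` and the sample vectors and scalar are illustrative choices, not part of the text.

```python
def inner(u, v):
    """Euclidean inner product on C^n: u1*conj(v1) + ... + un*conj(vn)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

# Arbitrary sample data chosen only for illustration.
u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 1j]
k = 2 + 3j

# Axiom (a): <u, v> is the conjugate of <v, u>.
symmetry_ok = abs(inner(u, v) - inner(v, u).conjugate()) < 1e-12

# Axiom (c): a scalar in the first slot comes out unchanged.
first_slot_ok = abs(inner([k * a for a in u], v) - k * inner(u, v)) < 1e-12

# Property (iii): a scalar in the second slot comes out conjugated.
second_slot_ok = abs(inner(u, [k * b for b in v]) - k.conjugate() * inner(u, v)) < 1e-12

print(symmetry_ok, first_slot_ok, second_slot_ok)
```

A check on one pair of vectors is of course not a proof; it only illustrates the statements proved above.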



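EXAMPLE 2 Inner Product on Complex $M_{22}$

If

$$U = \begin{bmatrix} u_1 & u_2 \\ u_3 & u_4 \end{bmatrix} \quad\text{and}\quad V = \begin{bmatrix} v_1 & v_2 \\ v_3 & v_4 \end{bmatrix}$$

are any $2 \times 2$ matrices with complex entries, then the following formula defines a complex inner product on complex $M_{22}$ 
(verify):

$$\langle U, V \rangle = u_1\bar{v}_1 + u_2\bar{v}_2 + u_3\bar{v}_3 + u_4\bar{v}_4$$

For example, if

$$U = \begin{bmatrix} 0 & i \\ 1 & 1+i \end{bmatrix} \quad\text{and}\quad V = \begin{bmatrix} i & -i \\ 0 & 2i \end{bmatrix}$$

then

$$\begin{aligned}
\langle U, V \rangle &= (0)(\bar{i}) + i(\overline{-i}) + (1)(\bar{0}) + (1+i)(\overline{2i}) \\
&= (0)(-i) + i(i) + (1)(0) + (1+i)(-2i) \\
&= 0 + i^2 + 0 - 2i - 2i^2 \\
&= 1 - 2i
\end{aligned}$$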
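
The arithmetic in this example can be reproduced mechanically. The sketch below is one way to do so in plain Python, assuming matrices are stored as nested lists; the name `matrix_inner` is a hypothetical helper introduced only for this illustration.

```python
def matrix_inner(U, V):
    """Inner product of Example 2: sum of each entry of U times the
    conjugate of the corresponding entry of V."""
    return sum(
        u * complex(v).conjugate()
        for row_u, row_v in zip(U, V)
        for u, v in zip(row_u, row_v)
    )

# The matrices U and V from the example above.
U = [[0, 1j], [1, 1 + 1j]]
V = [[1j, -1j], [0, 2j]]

result = matrix_inner(U, V)  # equals 1 - 2i, matching the hand computation
print(result)
```

Swapping the arguments gives the conjugate value, which is axiom (a) in action.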



EXAMPLE 3 Inner Product on Complex $C[a, b]$

Calculus Required

If $f(x) = f_1(x) + if_2(x)$ is a complex-valued function of the real variable $x$, and if $f_1(x)$ and $f_2(x)$ are continuous on 
$[a, b]$, then we define

$$\int_a^b f(x)\,dx = \int_a^b [f_1(x) + if_2(x)]\,dx = \int_a^b f_1(x)\,dx + i\int_a^b f_2(x)\,dx$$

In words, the integral of $f(x)$ is the integral of the real part of $f$ plus $i$ times the integral of the imaginary part of $f$.

We leave it as an exercise to show that if the functions $\mathbf{f} = f_1(x) + if_2(x)$ and $\mathbf{g} = g_1(x) + ig_2(x)$ are vectors in complex 
$C[a, b]$, then the following formula defines an inner product on complex $C[a, b]$:

$$\begin{aligned}
\langle \mathbf{f}, \mathbf{g} \rangle &= \int_a^b [f_1(x) + if_2(x)]\,\overline{[g_1(x) + ig_2(x)]}\,dx \\
&= \int_a^b [f_1(x) + if_2(x)][g_1(x) - ig_2(x)]\,dx
\end{aligned}$$

In complex inner product spaces, as in real inner product spaces, the norm (or length) of a vector $\mathbf{u}$ is defined by

$$\|\mathbf{u}\| = \langle \mathbf{u}, \mathbf{u} \rangle^{1/2}$$

and the distance between two vectors $\mathbf{u}$ and $\mathbf{v}$ is defined by

$$d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\|$$

It can be shown that with these definitions, Theorems 6.2.2 and 6.2.3 remain true in complex inner product spaces (Exercise 35).



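EXAMPLE 4 Norm and Distance in $C^n$

If $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ are vectors in $C^n$ with the Euclidean inner product, then

$$\|\mathbf{u}\| = \langle \mathbf{u}, \mathbf{u} \rangle^{1/2} = \sqrt{|u_1|^2 + |u_2|^2 + \cdots + |u_n|^2}$$

and

$$d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| = \langle \mathbf{u} - \mathbf{v}, \mathbf{u} - \mathbf{v} \rangle^{1/2} = \sqrt{|u_1 - v_1|^2 + |u_2 - v_2|^2 + \cdots + |u_n - v_n|^2}$$

Observe that these are just the formulas for the Euclidean norm and distance discussed in Section 10.4.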
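
As a small illustration of these two formulas, the sketch below computes a norm and a distance in $C^2$ using only Python's built-in complex numbers. The function names and the sample vectors are assumptions made for the example.

```python
import math

def norm(u):
    """Euclidean norm on C^n: square root of the sum of squared moduli."""
    return math.sqrt(sum(abs(a) ** 2 for a in u))

def distance(u, v):
    """Euclidean distance on C^n: the norm of the difference u - v."""
    return norm([a - b for a, b in zip(u, v)])

# Sample vectors chosen so that the answers are whole numbers.
u = [3j, 4]
v = [0, 4]

print(norm(u))         # sqrt(|3i|^2 + |4|^2) = sqrt(9 + 16) = 5.0
print(distance(u, v))  # sqrt(|3i|^2 + |0|^2) = 3.0
```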



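EXAMPLE 5 Norm of a Function in Complex $C[0, 2\pi]$

Calculus Required

If complex $C[0, 2\pi]$ has the inner product of Example 3, and if $\mathbf{f} = e^{imx}$, where $m$ is any integer, then with the help of Formula 
15 of Section 10.3, we obtain

$$\|\mathbf{f}\| = \langle \mathbf{f}, \mathbf{f} \rangle^{1/2} = \left[\int_0^{2\pi} e^{imx}\,\overline{e^{imx}}\,dx\right]^{1/2} = \left[\int_0^{2\pi} e^{imx}e^{-imx}\,dx\right]^{1/2} = \left[\int_0^{2\pi} dx\right]^{1/2} = \sqrt{2\pi}$$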



Orthogonal Sets 



The definitions of such terms as orthogonal vectors, orthogonal set, orthonormal set, and orthonormal basis carry over to 
complex inner product spaces without change. Moreover, Theorems 6.2.4, 6.3.1, 6.3.3, 6.3.4, 6.3.5, 6.3.6, and 6.5.1 remain 
valid in complex inner product spaces, and the Gram-Schmidt process can be used to convert an arbitrary basis for a complex 
inner product space into an orthonormal basis. 



EXAMPLE 6 Orthogonal Vectors in $C^2$

The vectors

$$\mathbf{u} = (i, 1) \quad\text{and}\quad \mathbf{v} = (1, i)$$

in $C^2$ are orthogonal with respect to the Euclidean inner product, since

$$\mathbf{u} \cdot \mathbf{v} = (i)(\bar{1}) + (1)(\bar{i}) = (i)(1) + (1)(-i) = 0$$



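EXAMPLE 7 Constructing an Orthonormal Basis for $C^3$

Consider the vector space $C^3$ with the Euclidean inner product. Apply the Gram-Schmidt process to transform the basis 
vectors $\mathbf{u}_1 = (i, i, i)$, $\mathbf{u}_2 = (0, i, i)$, $\mathbf{u}_3 = (0, 0, i)$ into an orthonormal basis.

Solution

Step 1. $\mathbf{v}_1 = \mathbf{u}_1 = (i, i, i)$

Step 2.

$$\mathbf{v}_2 = \mathbf{u}_2 - \operatorname{proj}_{W_1}\mathbf{u}_2 = \mathbf{u}_2 - \frac{\langle \mathbf{u}_2, \mathbf{v}_1 \rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 = (0, i, i) - \frac{2}{3}(i, i, i) = \left(-\frac{2}{3}i,\ \frac{1}{3}i,\ \frac{1}{3}i\right)$$

Step 3.

$$\begin{aligned}
\mathbf{v}_3 &= \mathbf{u}_3 - \operatorname{proj}_{W_2}\mathbf{u}_3 = \mathbf{u}_3 - \frac{\langle \mathbf{u}_3, \mathbf{v}_1 \rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 - \frac{\langle \mathbf{u}_3, \mathbf{v}_2 \rangle}{\|\mathbf{v}_2\|^2}\mathbf{v}_2 \\
&= (0, 0, i) - \frac{1}{3}(i, i, i) - \frac{1/3}{2/3}\left(-\frac{2}{3}i,\ \frac{1}{3}i,\ \frac{1}{3}i\right) = \left(0,\ -\frac{1}{2}i,\ \frac{1}{2}i\right)
\end{aligned}$$

Thus

$$\mathbf{v}_1 = (i, i, i), \quad \mathbf{v}_2 = \left(-\frac{2}{3}i,\ \frac{1}{3}i,\ \frac{1}{3}i\right), \quad \mathbf{v}_3 = \left(0,\ -\frac{1}{2}i,\ \frac{1}{2}i\right)$$

form an orthogonal basis for $C^3$. The norms of these vectors are

$$\|\mathbf{v}_1\| = \sqrt{3}, \quad \|\mathbf{v}_2\| = \frac{\sqrt{6}}{3}, \quad \|\mathbf{v}_3\| = \frac{1}{\sqrt{2}}$$

so an orthonormal basis for $C^3$ is

$$\frac{\mathbf{v}_1}{\|\mathbf{v}_1\|} = \left(\frac{i}{\sqrt{3}},\ \frac{i}{\sqrt{3}},\ \frac{i}{\sqrt{3}}\right), \quad \frac{\mathbf{v}_2}{\|\mathbf{v}_2\|} = \left(-\frac{2i}{\sqrt{6}},\ \frac{i}{\sqrt{6}},\ \frac{i}{\sqrt{6}}\right), \quad \frac{\mathbf{v}_3}{\|\mathbf{v}_3\|} = \left(0,\ -\frac{i}{\sqrt{2}},\ \frac{i}{\sqrt{2}}\right)$$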
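
The three steps of this example can be sketched in code. The following is an illustrative plain-Python version of the Gram-Schmidt process for the Euclidean inner product on $C^n$; the function names are hypothetical, and the projection coefficients are computed from the original vector $\mathbf{u}_k$, exactly as in Steps 2 and 3.

```python
import math

def inner(u, v):
    """Euclidean inner product on C^n (conjugate on the second vector)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def gram_schmidt(basis):
    """Return an orthogonal basis v1, v2, ... built from the given basis."""
    orthogonal = []
    for u in basis:
        v = [complex(a) for a in u]
        for w in orthogonal:
            # Subtract the projection of u onto the earlier vector w.
            coeff = inner(u, w) / inner(w, w)
            v = [a - coeff * b for a, b in zip(v, w)]
        orthogonal.append(v)
    return orthogonal

def normalize(v):
    """Divide v by its Euclidean norm."""
    length = math.sqrt(inner(v, v).real)
    return [a / length for a in v]

basis = [[1j, 1j, 1j], [0, 1j, 1j], [0, 0, 1j]]
orthogonal = gram_schmidt(basis)
orthonormal = [normalize(v) for v in orthogonal]

for q in orthonormal:
    print(q)
```

Up to rounding, `orthogonal` reproduces $\mathbf{v}_1$, $\mathbf{v}_2$, $\mathbf{v}_3$ from the solution, and the vectors in `orthonormal` have pairwise inner products 0 and norms 1.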



EXAMPLE 8 Orthonormal Set in Complex $C[0, 2\pi]$

Calculus Required

Let complex $C[0, 2\pi]$ have the inner product of Example 3, and let $W$ be the set of vectors in $C[0, 2\pi]$ of the form

$$e^{imx} = \cos mx + i \sin mx$$

where $m$ is an integer. The set $W$ is orthogonal because if

$$\mathbf{f} = e^{ikx} \quad\text{and}\quad \mathbf{g} = e^{ilx}$$

are distinct vectors in $W$, then

$$\begin{aligned}
\langle \mathbf{f}, \mathbf{g} \rangle &= \int_0^{2\pi} e^{ikx}\,\overline{e^{ilx}}\,dx = \int_0^{2\pi} e^{ikx}e^{-ilx}\,dx = \int_0^{2\pi} e^{i(k-l)x}\,dx \\
&= \int_0^{2\pi} \cos(k - l)x\,dx + i\int_0^{2\pi} \sin(k - l)x\,dx \\
&= \left[\frac{1}{k - l}\sin(k - l)x\right]_0^{2\pi} - i\left[\frac{1}{k - l}\cos(k - l)x\right]_0^{2\pi} \\
&= (0) - i(0) = 0
\end{aligned}$$

If we normalize each vector in the orthogonal set $W$, we obtain an orthonormal set. But in Example 5 we showed that each 
vector in $W$ has norm $\sqrt{2\pi}$, so the vectors

$$\frac{1}{\sqrt{2\pi}}e^{imx}, \quad m = 0, \pm 1, \pm 2, \ldots$$

form an orthonormal set in complex $C[0, 2\pi]$.
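
The orthonormality claimed in Examples 5 and 8 can be checked numerically. The sketch below approximates the inner product of Example 3 on $[0, 2\pi]$ with a midpoint rule; the function names, the number of subintervals, and the choice of quadrature rule are all assumptions of this illustration, not part of the text.

```python
import cmath
import math

def inner_c(f, g, a=0.0, b=2 * math.pi, n=2000):
    """Midpoint-rule approximation of the integral of f(x) * conj(g(x)) on [a, b]."""
    h = (b - a) / n
    total = 0j
    for j in range(n):
        x = a + (j + 0.5) * h
        total += f(x) * g(x).conjugate()
    return h * total

def e(m):
    """The normalized vector (1 / sqrt(2*pi)) * e^{imx} as a Python function."""
    return lambda x: cmath.exp(1j * m * x) / math.sqrt(2 * math.pi)

print(abs(inner_c(e(2), e(2))))   # close to 1: each vector has norm 1
print(abs(inner_c(e(2), e(-3))))  # close to 0: distinct vectors are orthogonal
```

The printed values are approximations, so they should be compared with 1 and 0 only up to a small tolerance.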



Exercise Set 10.5 






1. Let $\mathbf{u} = (u_1, u_2)$ and $\mathbf{v} = (v_1, v_2)$. Show that $\langle \mathbf{u}, \mathbf{v} \rangle = 3u_1\bar{v}_1 + 2u_2\bar{v}_2$ defines an inner product on $C^2$.



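2. Compute $\langle \mathbf{u}, \mathbf{v} \rangle$ using the inner product in Exercise 1.

(a) $\mathbf{u} = (2i, -i)$, $\mathbf{v} = (-i, 3i)$

(b) $\mathbf{u} = (0, 0)$, $\mathbf{v} = (1 - i, 7 - 5i)$

(c) $\mathbf{u} = (1 + i, 1 - i)$, $\mathbf{v} = (1 - i, 1 + i)$

(d) $\mathbf{u} = (3i, -1 + 2i)$, $\mathbf{v} = (3i, -1 + 2i)$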

3. Let $\mathbf{u} = (u_1, u_2)$ and $\mathbf{v} = (v_1, v_2)$. Show that

$$\langle \mathbf{u}, \mathbf{v} \rangle = u_1\bar{v}_1 + (1 + i)u_1\bar{v}_2 + (1 - i)u_2\bar{v}_1 + 3u_2\bar{v}_2$$

defines an inner product on $C^2$.

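4. Compute $\langle \mathbf{u}, \mathbf{v} \rangle$ using the inner product in Exercise 3.

(a) $\mathbf{u} = (2i, -i)$, $\mathbf{v} = (-i, 3i)$

(b) $\mathbf{u} = (0, 0)$, $\mathbf{v} = (1 - i, 7 - 5i)$

(c) $\mathbf{u} = (1 + i, 1 - i)$, $\mathbf{v} = (1 - i, 1 + i)$

(d) $\mathbf{u} = (3i, -1 + 2i)$, $\mathbf{v} = (3i, -1 + 2i)$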



Let u = (u\ 7 U2) an d v = (vi, V2)- Determine which of the following are inner products on ^2. For those that are not, list 
*• the axioms that do not hold. 



(a) (u, v} = uivi 

(b) (u ? v} = wivi -u 2 V2 

(c) (u ? v}=|^i| 2 |v 1 | 2 I M 2 M 2 

(d) (u, v} = 2zqvi I iu\V2 I iu2V\ -\-2u2V2 

(e) (u, vj = 2uivi I iu\V2 — i^2^l I 2z^2 



Use the inner product of Example 2 to find ( U r V} if 



U = 



-i 1 +i 
1 — i i 



and V = 



3 -2-3i 
4i 1 



Let u = (u u U2? U3 ) and v = ( Vl? v 2 , V3). Does (u, v} = uivi I W2V2 I U3V3-JU3V1 define an inner product on c 3 ? If not, 
7 • list all axioms that fail to hold. 

Let V be the vector space of complex- valued functions of the real variable x, and let f = f 1 (*) | j/ 2 (x) and 
*• g = gi (x) 1 ig20O be vectors in V. Does 

(f,g}=C/i(0) I i/ 2 (0))(gi(0) I ig 2 (0)) 

define an inner product on V? If not, list all axioms that fail to hold. 

9. Let $C^2$ have the inner product of Exercise 1. Find $\|\mathbf{w}\|$ if

(a) $\mathbf{w} = (-i, 3i)$

(b) $\mathbf{w} = (1 - i, 1 + i)$

(c) $\mathbf{w} = (0, 2 - i)$

(d) $\mathbf{w} = (0, 0)$



10. 



For each vector in Exercise 9, use the Euclidean inner product to find \\w\\ 



11. 



Use the inner product of Exercise 3 to find ||w|| if 

(a) w =(l, -0 

(b) w=(l-i, 1+0 

(c) w=(3-4i.O) 

(d) «r=(0, 0) 



12. 



Use the inner product of Example 2 to find || j4|| 



& A = 



— i 7i 
6i 2i 



(b) ,_ 



-1 1+i 
1-i 3 



13. 



Let c 3 have the inner product of Exercise 1. Find ^( x , y) if 



(a) x=(l, l).y=(i p -0 



(b) x=(l— i, 3 I 2i),y=(l | i, 3) 



14. 



Repeat the directions of Exercise 13 using the Euclidean inner product on q2\. 



15. 



Repeat the directions of Exercise 13 using the inner product of Exercise 3. 



16. 



Let complex Mji nave tne mner product of Example 2. Find d(A, B) if 



(a) 



(b) 



A = 



A = 



i 5i 
Si Zi 


and 5 = 


"-5i 
7i 


0" 
-3i_ 




' -1 
1+i 


2 


and 


B = 


"2; 2- 
i 


-Zi 
1 



17. 



Let c 3 have the Euclidean inner product. For which complex values of k are u and v orthogonal? 



( a ) n = (2i, i, 3z)' v = (z, 6z, A:) 



(b) u =(*,£, 1 I 0'V=(1, -1, 1-0 



18. 



Let complex jy 2 2 have the inner product of Example 2. Determine which of the following are orthogonal to 



,4 = 



2i i 
-i 3i 



(a) 



(b) 



-3 1 


— i 


1-i 2 


1 r 




-1 





(c) 








(d) 



1 
3-z 



19. Let c 3 have the Euclidean inner product. Show that for all values of the variable 0, the vector x = e } \—j=, —j=, —f= has 
norm 1 and is orthogonal to both (1, z, 0) and (0, i, — j). 

Let c 2 have the Euclidean inner product. Which of the following form orthonormal sets? 
20. 

(a) (i,0), (0.1-0 



(b) 






i i 



c l^w(~^~^ 



(d) (i,0),(0,0) 



21. 



Let c 3 have the Euclidean inner product. Which of the following form orthonormal sets? 



(a) 



k*- h)\k k~ h)\~ k*' h> 



{h) [¥-¥\^[¥^-m^¥¥) 



v c ) / i i 2i \ i i i 



.f'F fi)\fi' ft 



Let 



22. 



x = 



i _ _J_ \ ^ _ / 2i 3; 

{5' fij ' " ^ l /30' l /30 j 



Show that {x ? y} is an orthonormal set if ^2 has the inner product 



but is not orthonormal if C 2 has the Euclidean inner product. 

Show that 

ui = (i, 0, 0, 0, ii2 = C-i, 0, 2i,i), U3 = (2i,3i,2i, -2i), ii4=(-i, 2i, -i,i) 
is an orthogonal set in c 4 with the Euclidean inner product. By normalizing each of these vectors, obtain an orthonormal 
set. 

Let c 2 have the Euclidean inner product. Use the Gram-Schmidt process to transform the basis ( Ul? u 2 } into an 
^' orthonormal basis. 



(a) m = (i, — 30» U2 = (2i, 2i) 

(b) ui = (i, 0), u 2 = (3i, -50 



Let c 3 have the Euclidean inner product. Use the Gram-Schmidt process to transform the basis (u 1? ^ 113} into an 
25- orthonormal basis. 



( a ) iii — (i> h 0' u 2 — ( — h i, 0)? 113 = (z, 2z, 

(b) ui = (i, 0, 0), u 2 = (3i, 7i, - 2i), 113 = (0, 4i, i) 

Let c 4 have the Euclidean inner product. Use the Gram-Schmidt process to transform the basis ( Ul? 112, 113, 114} into an 
2" - orthonormal basis. 

ui = (0, 2i, i, 0), ll 2 — (}> — z, 0, 0), U3 = (z, 2z ? 0, —i), U4=(z, 0, z, 

Let c 3 have the Euclidean inner product. Find an orthonormal basis for the subspace spanned by (0 r i r 1 — i) and 
27 " (-i,0,l I 0- 

Let g^ have the Euclidean inner product. Express the vector w — ( _ ^ 2z, 6z ? 0) in the form w = wi I W2> where the vector 
2°- wi is in the space W spanned by Ul = ( _ j r 0, z, 20 and u 2 = (0, z, 0, 0' anc ^ w 2 * s or thogonal to W. 



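29.

(a) Prove: If $k$ is a complex number and $\langle \mathbf{u}, \mathbf{v} \rangle$ is an inner product on a complex vector space, then

$$\langle \mathbf{u} - k\mathbf{v}, \mathbf{u} - k\mathbf{v} \rangle = \langle \mathbf{u}, \mathbf{u} \rangle - \bar{k}\langle \mathbf{u}, \mathbf{v} \rangle - k\overline{\langle \mathbf{u}, \mathbf{v} \rangle} + k\bar{k}\langle \mathbf{v}, \mathbf{v} \rangle$$

(b) Use the result in part (a) to prove that $0 \le \langle \mathbf{u}, \mathbf{u} \rangle - \bar{k}\langle \mathbf{u}, \mathbf{v} \rangle - k\overline{\langle \mathbf{u}, \mathbf{v} \rangle} + k\bar{k}\langle \mathbf{v}, \mathbf{v} \rangle$.

30. Prove that if $\mathbf{u}$ and $\mathbf{v}$ are vectors in a complex inner product space, then

$$|\langle \mathbf{u}, \mathbf{v} \rangle|^2 \le \langle \mathbf{u}, \mathbf{u} \rangle\langle \mathbf{v}, \mathbf{v} \rangle$$

This result, called the Cauchy-Schwarz inequality for complex inner product spaces, differs from its real analog 
(Theorem 6.2.1) in that an absolute value sign must be included on the left side.

Hint Let $k = \langle \mathbf{u}, \mathbf{v} \rangle / \langle \mathbf{v}, \mathbf{v} \rangle$ in the inequality of Exercise 29(b).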

31. Prove: If $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ are vectors in $C^n$, then

$$|u_1\bar{v}_1 + u_2\bar{v}_2 + \cdots + u_n\bar{v}_n| \le \left(|u_1|^2 + |u_2|^2 + \cdots + |u_n|^2\right)^{1/2}\left(|v_1|^2 + |v_2|^2 + \cdots + |v_n|^2\right)^{1/2}$$

This is the complex version of Formula 4 in Theorem 4.1.3.

Hint Use Exercise 30.

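32. Prove that equality holds in the Cauchy-Schwarz inequality for complex vector spaces if and only if $\mathbf{u}$ and $\mathbf{v}$ are linearly dependent.

33. Prove that if $\langle \mathbf{u}, \mathbf{v} \rangle$ is an inner product on a complex vector space, then $\langle \mathbf{0}, \mathbf{v} \rangle = \langle \mathbf{v}, \mathbf{0} \rangle = 0$.

34. Prove that if $\langle \mathbf{u}, \mathbf{v} \rangle$ is an inner product on a complex vector space, then $\langle \mathbf{u}, \mathbf{v} + \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{v} \rangle + \langle \mathbf{u}, \mathbf{w} \rangle$.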

Theorems 6.2.2 and 6.2.3 remain true in complex inner product spaces. In each part, prove that this is so. 
35. 

(a) Theorem 6.2.2a 

(b) Theorem 6.2.2b 

(c) Theorem 6.2.2c 

(d) Theorem 6. 2. 2d 

(e) Theorem 6.2.3a 

(f) Theorem 6.2.3b 

(g) Theorem 6.2.3c 
(h) Theorem 6.2.3d 



36. 



In Example 7 it was shown that the vectors 



vi \fckh) V2 \f'kh) T3 vfcfci 



form an orthonormal basis for c 3 . Use Theorem 6.3.1 to express u = (1 — i, 1 I i, 1) as a linear combination of these 
vectors. 



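37. Prove that if $\mathbf{u}$ and $\mathbf{v}$ are vectors in a complex inner product space, then

$$\langle \mathbf{u}, \mathbf{v} \rangle = \tfrac{1}{4}\|\mathbf{u} + \mathbf{v}\|^2 - \tfrac{1}{4}\|\mathbf{u} - \mathbf{v}\|^2 + \tfrac{i}{4}\|\mathbf{u} + i\mathbf{v}\|^2 - \tfrac{i}{4}\|\mathbf{u} - i\mathbf{v}\|^2$$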



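38. Prove: If $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is an orthonormal basis for a complex inner product space $V$, and if $\mathbf{u}$ and $\mathbf{w}$ are any vectors in $V$, then

$$\langle \mathbf{u}, \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{v}_1 \rangle\overline{\langle \mathbf{w}, \mathbf{v}_1 \rangle} + \langle \mathbf{u}, \mathbf{v}_2 \rangle\overline{\langle \mathbf{w}, \mathbf{v}_2 \rangle} + \cdots + \langle \mathbf{u}, \mathbf{v}_n \rangle\overline{\langle \mathbf{w}, \mathbf{v}_n \rangle}$$

Hint Use Theorem 6.3.1 to express $\mathbf{u}$ and $\mathbf{w}$ as linear combinations of the basis vectors.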



39. (For Readers Who Have Studied Calculus) Prove that if f = f ^ (*) | ifjix) an d g = g± (*) | zg 2 (*) are vectors in 
complex C[a, i] then the formula 



defines a complex inner product on C|>,£]. 



40. (For Readers Who Have Studied Calculus) Let f = x and g = 1 -I ix be vectors in complex C[0 7 1 ] and let this space 
have the inner product defined in Exercise 39. Find 

(a) Hell 

(b) (f , g} 

(c) (B-f) 



41. (For Readers Who Have Studied Calculus) Let f = f 1 (*) | i/2(x) an d g = gi (*) I ig20O ^ e vectors i n complex 
C[0 7 1 ] and let this space have the inner product defined in Exercise 39. Show that the vectors g 2 ^™*, where m = Q, ±1. 
j_ 2, • • ., form an orthonormal set. 
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10.6 

UNITARY, NORMAL, 
AND HERMITIAN 
MATRICES 



For matrices with real entries, the orthogonal matrices ($A^{-1} = A^T$) and the 
symmetric matrices ($A = A^T$) played an important role in the orthogonal 
diagonalization problem (Section 7.3). For matrices with complex entries, the 
orthogonal and symmetric matrices are of little importance; they are 
superseded by two new classes of matrices, the unitary and Hermitian 
matrices, which we shall discuss in this section. 



Unitary Matrices 

If $A$ is a matrix with complex entries, then the conjugate transpose of $A$, denoted by $A^*$, is defined by

$$A^* = \bar{A}^T$$

where $\bar{A}$ is the matrix whose entries are the complex conjugates of the corresponding entries in $A$ and $\bar{A}^T$ is the transpose of $\bar{A}$. 
The conjugate transpose is also called the Hermitian transpose.



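EXAMPLE 1 Conjugate Transpose

If

$$A = \begin{bmatrix} 1+i & -i & 0 \\ 2 & 3-2i & i \end{bmatrix} \quad\text{then}\quad \bar{A} = \begin{bmatrix} 1-i & i & 0 \\ 2 & 3+2i & -i \end{bmatrix}$$

so

$$A^* = \bar{A}^T = \begin{bmatrix} 1-i & 2 \\ i & 3+2i \\ 0 & -i \end{bmatrix}$$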
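
The two operations in the definition (conjugate every entry, then transpose) translate directly into code. The following is a minimal sketch in plain Python with matrices stored as nested lists; the name `conjugate_transpose` is a hypothetical helper, not notation from the text.

```python
def conjugate_transpose(A):
    """Return A* : entry (j, i) of the result is the conjugate of entry (i, j) of A."""
    rows, cols = len(A), len(A[0])
    return [[complex(A[i][j]).conjugate() for i in range(rows)] for j in range(cols)]

# The 2 x 3 matrix A from Example 1.
A = [[1 + 1j, -1j, 0],
     [2, 3 - 2j, 1j]]

A_star = conjugate_transpose(A)  # a 3 x 2 matrix
for row in A_star:
    print(row)
```

Applying the function twice returns the original matrix, which is the identity $(A^*)^* = A$.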



The following theorem shows that the basic properties of the conjugate transpose are similar to those of the transpose. The 
proofs are left as exercises. 



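THEOREM 10.6.1

If $A$ and $B$ are matrices with complex entries and $k$ is any complex number, then:

(a) $(A^*)^* = A$

(b) $(A + B)^* = A^* + B^*$

(c) $(kA)^* = \bar{k}A^*$

(d) $(AB)^* = B^*A^*$

Remark Recall from Formula 7 of Section 4.1 that if $\mathbf{u}$ and $\mathbf{v}$ are column vectors in $R^n$, then the Euclidean inner product on $R^n$ 
can be expressed as $\mathbf{u} \cdot \mathbf{v} = \mathbf{v}^T\mathbf{u}$. Similarly, if $\mathbf{u}$ and $\mathbf{v}$ are column vectors in $C^n$, then the Euclidean 
inner product on $C^n$ can be expressed as $\mathbf{u} \cdot \mathbf{v} = \mathbf{v}^*\mathbf{u}$.

Recall that a matrix with real entries is called orthogonal if $A^{-1} = A^T$. The complex analogs of the orthogonal matrices are 
called unitary matrices. They are defined as follows:

DEFINITION

A square matrix $A$ with complex entries is called unitary if

$$A^{-1} = A^*$$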



The following theorem parallels Theorem 6.6.1. 



THEOREM 10.6.2

Equivalent Statements

If $A$ is an $n \times n$ matrix with complex entries, then the following are equivalent.

(a) $A$ is unitary.

(b) The row vectors of $A$ form an orthonormal set in $C^n$ with the Euclidean inner product.

(c) The column vectors of $A$ form an orthonormal set in $C^n$ with the Euclidean inner product.



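EXAMPLE 2 A $2 \times 2$ Unitary Matrix

The matrix

$$A = \begin{bmatrix} \dfrac{1+i}{2} & \dfrac{1+i}{2} \\[2mm] \dfrac{1-i}{2} & \dfrac{-1+i}{2} \end{bmatrix} \tag{1}$$

has row vectors

$$\mathbf{r}_1 = \left(\frac{1+i}{2},\ \frac{1+i}{2}\right), \quad \mathbf{r}_2 = \left(\frac{1-i}{2},\ \frac{-1+i}{2}\right)$$

Relative to the Euclidean inner product on $C^2$, we have

$$\|\mathbf{r}_1\| = \sqrt{\left|\frac{1+i}{2}\right|^2 + \left|\frac{1+i}{2}\right|^2} = \sqrt{\frac{1}{2} + \frac{1}{2}} = 1$$

$$\|\mathbf{r}_2\| = \sqrt{\left|\frac{1-i}{2}\right|^2 + \left|\frac{-1+i}{2}\right|^2} = \sqrt{\frac{1}{2} + \frac{1}{2}} = 1$$

and

$$\mathbf{r}_1 \cdot \mathbf{r}_2 = \left(\frac{1+i}{2}\right)\overline{\left(\frac{1-i}{2}\right)} + \left(\frac{1+i}{2}\right)\overline{\left(\frac{-1+i}{2}\right)} = \left(\frac{1+i}{2}\right)\left(\frac{1+i}{2}\right) + \left(\frac{1+i}{2}\right)\left(\frac{-1-i}{2}\right) = \frac{i}{2} - \frac{i}{2} = 0$$

so the row vectors form an orthonormal set in $C^2$. Thus $A$ is unitary and

$$A^{-1} = A^* = \begin{bmatrix} \dfrac{1-i}{2} & \dfrac{1+i}{2} \\[2mm] \dfrac{1-i}{2} & \dfrac{-1-i}{2} \end{bmatrix} \tag{2}$$

The reader should verify that matrix (2) is the inverse of matrix (1) by showing that $AA^* = A^*A = I$.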
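
The verification suggested at the end of this example can also be carried out numerically. The sketch below, in plain Python, multiplies $A$ by $A^*$ in both orders and compares the products with the identity up to a small tolerance; the helper names and the tolerance are assumptions of this illustration.

```python
def conjugate_transpose(A):
    """Return the conjugate transpose A* of a matrix stored as nested lists."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))] for j in range(len(A[0]))]

def matmul(X, Y):
    """Ordinary matrix product of nested-list matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def is_identity(M, tol=1e-12):
    """True if the square matrix M equals the identity up to the tolerance tol."""
    n = len(M)
    return all(abs(M[i][j] - (1 if i == j else 0)) < tol for i in range(n) for j in range(n))

def is_unitary(A, tol=1e-12):
    """True if A A* and A* A are both the identity (up to tol)."""
    A_star = conjugate_transpose(A)
    return is_identity(matmul(A, A_star), tol) and is_identity(matmul(A_star, A), tol)

# The matrix of Example 2.
A = [[(1 + 1j) / 2, (1 + 1j) / 2],
     [(1 - 1j) / 2, (-1 + 1j) / 2]]

print(is_unitary(A))
```

For contrast, a matrix such as `[[1, 1], [0, 1]]` is invertible but fails this test, since its rows are not orthonormal.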




Charles Hermite (1822-1901) was a French mathematician who made fundamental contributions to algebra, matrix theory, 
and various branches of analysis. He is noted for using integrals to solve a general fifth-degree polynomial equation. He also 
proved that the number e (the base for natural logarithms) is a transcendental number — that is, a number that is not the root 
of any polynomial equation with rational coefficients. 



Recall that a square matrix $A$ with real entries is called orthogonally diagonalizable if there is an orthogonal matrix $P$ such that 
$P^{-1}AP\ (= P^TAP)$ is diagonal. For complex matrices we have an analogous concept.

DEFINITION

A square matrix $A$ with complex entries is called unitarily diagonalizable if there is a unitary $P$ such that $P^{-1}AP\ (= P^*AP)$ 
is diagonal; the matrix $P$ is said to unitarily diagonalize $A$.



We have two questions to consider: 



Which matrices are unitarily diagonalizable? 



How do we find a unitary matrix $P$ to carry out the diagonalization?



Before pursuing these questions, we note that our earlier definitions of the terms eigenvector, eigenvalue, eigenspace, 
characteristic equation, and characteristic polynomial carry over without change to complex vector spaces. 

Hermitian Matrices 



In Section 7.3 we saw that the problem of orthogonally diagonalizing a matrix with real entries led to consideration of the 
symmetric matrices. The most natural complex analogs of the real symmetric matrices are the Hermitian matrices, which are 
defined as follows: 



DEFINITION 



A square matrix $A$ with complex entries is called Hermitian if

$$A = A^*$$



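EXAMPLE 3 A $3 \times 3$ Hermitian Matrix

If

$$A = \begin{bmatrix} 1 & i & 1+i \\ -i & -5 & 2-i \\ 1-i & 2+i & 3 \end{bmatrix}$$

then

$$\bar{A} = \begin{bmatrix} 1 & -i & 1-i \\ i & -5 & 2+i \\ 1+i & 2-i & 3 \end{bmatrix}$$

so

$$A^* = \bar{A}^T = \begin{bmatrix} 1 & i & 1+i \\ -i & -5 & 2-i \\ 1-i & 2+i & 3 \end{bmatrix} = A$$

which means that $A$ is Hermitian.

It is easy to recognize Hermitian matrices by inspection: As seen in (3), the entries on the main diagonal are real numbers, and the 
"mirror image" of each entry across the main diagonal is its complex conjugate.

$$\begin{bmatrix} 1 & i & 1+i \\ -i & -5 & 2-i \\ 1-i & 2+i & 3 \end{bmatrix} \tag{3}$$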
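
The inspection rule just described (real diagonal, conjugate mirror images) is exactly the entrywise form of $A = A^*$, and can be sketched as a short test. The function name `is_hermitian` is an illustrative choice.

```python
def is_hermitian(A):
    """True if A is square and each entry (i, j) equals the conjugate of entry (j, i)."""
    n = len(A)
    if any(len(row) != n for row in A):
        return False
    return all(A[i][j] == complex(A[j][i]).conjugate() for i in range(n) for j in range(n))

# The matrix of Example 3.
A = [[1, 1j, 1 + 1j],
     [-1j, -5, 2 - 1j],
     [1 - 1j, 2 + 1j, 3]]

print(is_hermitian(A))
```

Taking `i == j` in the test forces each diagonal entry to equal its own conjugate, which is why the diagonal of a Hermitian matrix must be real.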



Normal Matrices 

Hermitian matrices enjoy many but not all of the properties of real symmetric matrices. For example, just as the real symmetric 
matrices are orthogonally diagonalizable, so we shall see that the Hermitian matrices are unitarily diagonalizable. However, 
whereas the real symmetric matrices are the only matrices with real entries that are orthogonally diagonalizable (Theorem 7.3.1), 
the Hermitian matrices do not constitute the entire class of unitarily diagonalizable matrices; that is, there are unitarily 
diagonalizable matrices that are not Hermitian. To explain why this is so, we shall need the following definition: 



DEFINITION 



A square matrix $A$ with complex entries is called normal if

$$AA^* = A^*A$$



EXAMPLE 4 Hermitian and Unitary Matrices 



Every Hermitian matrix $A$ is normal since $AA^* = AA = A^*A$, and every unitary matrix $A$ is normal since $AA^* = I = A^*A$. 
The following two theorems are the complex analogs of Theorems 7.3.1 and 7.3.2. The proofs will be omitted. 
THEOREM 10.6.3 



Equivalent Statements 

If A is a square matrix with complex entries, then the following are equivalent: 

(a) A is unitarily diagonalizable. 

(b) A has an orthonormal set of n eigenvectors. 

(c) A is normal. 



THEOREM 10.6.4 



If $A$ is a normal matrix, then eigenvectors from different eigenspaces of $A$ are orthogonal.



Theorem 10.6.3 tells us that a square matrix A with complex entries is unitarily diagonalizable if and only if it is normal. 
Theorem 10.6.4 will be the key to constructing a matrix that unitarily diagonalizes a normal matrix. 

Diagonalization Procedure 

We saw in Section 7.3 that a symmetric matrix $A$ is orthogonally diagonalized by any orthogonal matrix whose column vectors 
are eigenvectors of $A$. Similarly, a normal matrix $A$ is diagonalized by any unitary matrix whose column vectors are eigenvectors 
of $A$. The procedure for diagonalizing a normal matrix is as follows:

Step 1. Find a basis for each eigenspace of $A$.

Step 2. Apply the Gram-Schmidt process to each of these bases to obtain an orthonormal basis for each eigenspace.

Step 3. Form the matrix $P$ whose columns are the basis vectors constructed in Step 2. This matrix unitarily diagonalizes $A$.

The justification of this procedure should be clear. Theorem 10.6.4 ensures that eigenvectors from different eigenspaces are 
orthogonal, and the application of the Gram-Schmidt process ensures that the eigenvectors within the same eigenspace are 
orthonormal. Thus the entire set of eigenvectors obtained by this procedure is orthonormal. Theorem 10.6.3 ensures that this 
orthonormal set of eigenvectors is a basis. 



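EXAMPLE 5 Unitary Diagonalization

The matrix

$$A = \begin{bmatrix} 2 & 1+i \\ 1-i & 3 \end{bmatrix}$$

is unitarily diagonalizable because it is Hermitian and therefore normal. Find a matrix $P$ that unitarily diagonalizes $A$.

Solution

The characteristic polynomial of $A$ is

$$\det(\lambda I - A) = \det\begin{bmatrix} \lambda - 2 & -1-i \\ -1+i & \lambda - 3 \end{bmatrix} = (\lambda - 2)(\lambda - 3) - 2 = \lambda^2 - 5\lambda + 4$$

so the characteristic equation is

$$\lambda^2 - 5\lambda + 4 = (\lambda - 1)(\lambda - 4) = 0$$

and the eigenvalues are $\lambda = 1$ and $\lambda = 4$.

By definition,

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

will be an eigenvector of $A$ corresponding to $\lambda$ if and only if $\mathbf{x}$ is a nontrivial solution of

$$\begin{bmatrix} \lambda - 2 & -1-i \\ -1+i & \lambda - 3 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \tag{4}$$

To find the eigenvectors corresponding to $\lambda = 1$, we substitute this value in (4):

$$\begin{bmatrix} -1 & -1-i \\ -1+i & -2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

Solving this system by Gauss-Jordan elimination yields (verify)

$$x_1 = (-1 - i)s, \quad x_2 = s$$

Thus the eigenvectors of $A$ corresponding to $\lambda = 1$ are the nonzero vectors in $C^2$ of the form

$$\mathbf{x} = \begin{bmatrix} (-1-i)s \\ s \end{bmatrix} = s\begin{bmatrix} -1-i \\ 1 \end{bmatrix}$$

Thus this eigenspace is one-dimensional with basis

$$\mathbf{u} = \begin{bmatrix} -1-i \\ 1 \end{bmatrix}$$

In this case the Gram-Schmidt process involves only one step: normalizing this vector. Since

$$\|\mathbf{u}\| = \sqrt{|-1-i|^2 + |1|^2} = \sqrt{2 + 1} = \sqrt{3}$$

the vector

$$\mathbf{p}_1 = \frac{\mathbf{u}}{\|\mathbf{u}\|} = \begin{bmatrix} \dfrac{-1-i}{\sqrt{3}} \\[2mm] \dfrac{1}{\sqrt{3}} \end{bmatrix} \tag{5}$$

is an orthonormal basis for the eigenspace corresponding to $\lambda = 1$.

To find the eigenvectors corresponding to $\lambda = 4$, we substitute this value in (4):

$$\begin{bmatrix} 2 & -1-i \\ -1+i & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

Solving this system by Gauss-Jordan elimination yields (verify)

$$x_1 = \left(\frac{1+i}{2}\right)s, \quad x_2 = s$$

so the eigenvectors of $A$ corresponding to $\lambda = 4$ are the nonzero vectors in $C^2$ of the form

$$\mathbf{x} = \begin{bmatrix} \left(\dfrac{1+i}{2}\right)s \\[2mm] s \end{bmatrix} = s\begin{bmatrix} \dfrac{1+i}{2} \\[2mm] 1 \end{bmatrix}$$

Thus the eigenspace is one-dimensional with basis

$$\mathbf{u} = \begin{bmatrix} \dfrac{1+i}{2} \\[2mm] 1 \end{bmatrix}$$

Applying the Gram-Schmidt process (that is, normalizing this vector) yields

$$\mathbf{p}_2 = \frac{\mathbf{u}}{\|\mathbf{u}\|} = \begin{bmatrix} \dfrac{1+i}{\sqrt{6}} \\[2mm] \dfrac{2}{\sqrt{6}} \end{bmatrix}$$

Thus

$$P = [\mathbf{p}_1 \mid \mathbf{p}_2] = \begin{bmatrix} \dfrac{-1-i}{\sqrt{3}} & \dfrac{1+i}{\sqrt{6}} \\[2mm] \dfrac{1}{\sqrt{3}} & \dfrac{2}{\sqrt{6}} \end{bmatrix}$$

diagonalizes $A$ and

$$P^{-1}AP = \begin{bmatrix} 1 & 0 \\ 0 & 4 \end{bmatrix}$$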
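
The result of this example can be double-checked numerically. The sketch below, in plain Python, forms $P^*AP$ with the matrices found above and compares it with the diagonal matrix of eigenvalues; because $P$ is unitary, $P^*$ plays the role of $P^{-1}$. The helper names and the tolerance are assumptions of the illustration.

```python
import math

def conjugate_transpose(M):
    """Return the conjugate transpose M* of a nested-list matrix."""
    return [[complex(M[i][j]).conjugate() for i in range(len(M))] for j in range(len(M[0]))]

def matmul(X, Y):
    """Ordinary matrix product of nested-list matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[2, 1 + 1j],
     [1 - 1j, 3]]

s3, s6 = math.sqrt(3), math.sqrt(6)
# Columns are the normalized eigenvectors p1 (eigenvalue 1) and p2 (eigenvalue 4).
P = [[(-1 - 1j) / s3, (1 + 1j) / s6],
     [1 / s3, 2 / s6]]

P_star = conjugate_transpose(P)
D = matmul(matmul(P_star, A), P)   # expected to be close to diag(1, 4)
I2 = matmul(P_star, P)             # expected to be close to the identity

for row in D:
    print([round(z.real, 10) + round(z.imag, 10) * 1j for z in row])
```

The entries of `D` agree with the diagonal matrix above only up to floating-point rounding, which is why the printout rounds them.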



Eigenvalues of Hermitian and Symmetric Matrices 

In Theorem 7.3.2 it was stated that the eigenvalues of a symmetric matrix with real entries are real numbers. This important 
result is a corollary of the following more general theorem. 



THEOREM 10.6.5 



The eigenvalues of a Hermitian matrix are real numbers. 



Proof If $\lambda$ is an eigenvalue and $\mathbf{v}$ a corresponding eigenvector of an $n \times n$ Hermitian matrix $A$, then

$$A\mathbf{v} = \lambda\mathbf{v}$$

If we multiply each side of this equation on the left by $\mathbf{v}^*$ and then use the remark following 
Theorem 10.6.1 to write $\mathbf{v}^*\mathbf{v} = \|\mathbf{v}\|^2$ (with the Euclidean inner product on $C^n$), then we obtain

$$\mathbf{v}^*A\mathbf{v} = \mathbf{v}^*(\lambda\mathbf{v}) = \lambda\mathbf{v}^*\mathbf{v} = \lambda\|\mathbf{v}\|^2$$

But if we agree not to distinguish between the $1 \times 1$ matrix $\mathbf{v}^*A\mathbf{v}$ and its entry, and if we use the fact 
that eigenvectors are nonzero, then we can express $\lambda$ as

$$\lambda = \frac{\mathbf{v}^*A\mathbf{v}}{\|\mathbf{v}\|^2} \tag{6}$$

Thus, to show that $\lambda$ is a real number, it suffices to show that the entry of $\mathbf{v}^*A\mathbf{v}$ is real. One way to do 
this is to show that the matrix $\mathbf{v}^*A\mathbf{v}$ is Hermitian, since we know that Hermitian matrices have real 
numbers on the main diagonal. However,

$$(\mathbf{v}^*A\mathbf{v})^* = \mathbf{v}^*A^*(\mathbf{v}^*)^* = \mathbf{v}^*A\mathbf{v}$$

which shows that $\mathbf{v}^*A\mathbf{v}$ is Hermitian and completes the proof.



The proof of the following theorem is an immediate consequence of Theorem 10.6.5 and is left as an exercise. 



THEOREM 10.6.6 



The eigenvalues of a symmetric matrix with real entries are real numbers. 



Exercise Set 10.6 






In each part, find j[ * . 



(a) 



A = 



2i 1 — i 
4 3 + i 
5 + i 



(b) 



A = 



2i 1 — j —Hi 
4 5-7; -i 
i 3 1 



( C ) A= [H -3i] 



(d) 



A = 



.321 «22 «23 



Which of the following are Hermitian matrices? 



(a) 



i 
i 2 



(b) 



1 1+i 

1-i -3 



(c) 



— i i 



(d) 



-2 1-i 
1+i 

-1-i 3 



1+i 

3 

5 



(e) 



"1 





0" 





1 











1 



3. 



Find fa l, and m to make A a Hermitian matrix. 



^4 = 



-1 k -i 
3-5i m 
I 2 I 4; 2 



Use Theorem 10.6.2 to determine which of the following are unitary matrices. 



(a) 
In each 



i 
p@rt,jv|erify that the matrix is unitary and find its inverse. 



u 



1 *i 

5 5 
5 5 



fe {2 



(b) 
(c) 



1 

/2 


1 


1+i 

2 


1+i 
2 



(S 



, 1 



{l {I {3 



(d) 



1+i 


1 


1 


2 


2 


2 


i 

/3 


1 
/3 


i 


3 i i 


4 + 3J 


5i 



2/T5 2/15 2/15 



Show that the matrix 



1 






is unitary for every real value of (}. 
In Exercises 7-12 find a unitary matrix p that diagonalizes ^4, and determine P~ l AP- 



7. 



,4 = 



4 1-i 
1+i 5 



,4 = 



3 -i 
i 3 



9. 



A = 



5 2 I 2i 
2-2; 4 



10. 



3 + J 
3-i -3 



11. ,4 = 





-1 
1-i 





1+i 





12. 



,4 = 



■f2 

i 
f2 



i 


i 


2 








2 



13. 



Show that the eigenvalues of the symmetric matrix 



,4 = 



1 Ai 
Ai 3_ 

are not real. Does this violate Theorem 10.6.6? 



14. 



(a) Find a 2 x 2 matrix that is both Hermitian and unitary and whose entries are not all real numbers. 



(b) What can you say about the inverse of a matrix that is both Hermitian and unitary? 



15. 



Prove: If $A$ is an $n \times n$ matrix with complex entries, then $\det(\bar{A}) = \overline{\det(A)}$.

Hint First show that the signed elementary products from $\bar{A}$ are the conjugates of the signed elementary products from $A$.



16. 



(a) Use the result of Exercise 15 to prove that if $A$ is an $n \times n$ matrix with complex entries, then $\det(A^*) = \overline{\det(A)}$.

(b) Prove: If $A$ is Hermitian, then $\det(A)$ is real.

(c) Prove: If $A$ is unitary, then $|\det(A)| = 1$.



17. 



Prove that the entries on the main diagonal of a Hermitian matrix are real numbers. 



18. 



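Let

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix}$$

be matrices with complex entries. Show that

(a) $(A^*)^* = A$

(b) $(A + B)^* = A^* + B^*$

(c) $(kA)^* = \bar{k}A^*$

(d) $(AB)^* = B^*A^*$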



19. 



Prove: If $A$ is invertible, then so is $A^*$, in which case $(A^*)^{-1} = (A^{-1})^*$.



20. 



Show that if $A$ is a unitary matrix, then $A^*$ is also unitary.



21. Prove that an $n \times n$ matrix with complex entries is unitary if and only if its rows form an orthonormal set in $C^n$ with the Euclidean inner product.

22. Use Exercises 20 and 21 to show that an $n \times n$ matrix is unitary if and only if its columns form an orthonormal set in $C^n$ with the Euclidean inner product.

23. Let $\lambda$ and $\mu$ be distinct eigenvalues of a Hermitian matrix $A$.

(a) Prove that if $\mathbf{x}$ is an eigenvector corresponding to $\lambda$ and $\mathbf{y}$ an eigenvector corresponding to $\mu$, then $\mathbf{x}^*A\mathbf{y} = \lambda\mathbf{x}^*\mathbf{y}$ and $\mathbf{x}^*A\mathbf{y} = \mu\mathbf{x}^*\mathbf{y}$.



(b) Prove Theorem 10.6.4. 
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Chapter 10 



Supplementary Exercises 



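1.

Let \(\mathbf{u} = (u_1, u_2, \ldots, u_n)\) and \(\mathbf{v} = (v_1, v_2, \ldots, v_n)\) be vectors in \(C^n\), and let \(\bar{\mathbf{u}} = (\bar{u}_1, \bar{u}_2, \ldots, \bar{u}_n)\) and \(\bar{\mathbf{v}} = (\bar{v}_1, \bar{v}_2, \ldots, \bar{v}_n)\).

(a) Prove: \(\bar{\mathbf{u}} \cdot \bar{\mathbf{v}} = \overline{\mathbf{u} \cdot \mathbf{v}}\).

(b) Prove: \(\mathbf{u}\) and \(\mathbf{v}\) are orthogonal if and only if \(\bar{\mathbf{u}}\) and \(\bar{\mathbf{v}}\) are orthogonal.

2.

Show that if the matrix

\[
\begin{bmatrix} a & b \\ -\bar{b} & \bar{a} \end{bmatrix}
\]

is nonzero, then it is invertible.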



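3.

Find a basis for the solution space of the system

\[
\begin{bmatrix} -1 & -i & 1 \\ -i & 1 & i \\ 1 & i & -1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]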


4.

Prove: If \(a\) and \(b\) are complex numbers such that \(|a|^2 + |b|^2 = 1\), and if \(\theta\) is a real number, then

\[
A = \begin{bmatrix} a & b \\ -e^{i\theta}\bar{b} & e^{i\theta}\bar{a} \end{bmatrix}
\]

is a unitary matrix.



5. Find the eigenvalues of the matrix







1 



1 o ^+1+^ 



1 _^,_i_. 



^' 



where \(\omega = e^{2\pi i/3}\).



6.

Consider the relation between the complex number \(z = a + ib\) and the corresponding \(2 \times 2\) matrix with real entries

\[
Z = \begin{bmatrix} a & -b \\ b & a \end{bmatrix}
\]

(a) How are the eigenvalues of \(Z\) related to \(z\)?

(b) How is the complex number \(z\) related to the determinant of \(Z\)?

(c) Show that \(z^{-1}\) corresponds to \(Z^{-1}\).

(d) Show that the product \((a + bi)(c + di)\) corresponds to the matrix that is the product of the matrices corresponding to \(a + bi\) and \(c + di\).

7.

(a) Prove that if \(z\) is a complex number other than 1, then

\[
1 + z + z^2 + \cdots + z^n = \frac{1 - z^{n+1}}{1 - z}
\]

Hint Let \(S\) be the sum on the left side of the equation and consider the quantity \(S - zS\).

(b) Use the result in part (a) to prove that if \(z^n = 1\) and \(z \neq 1\), then \(1 + z + z^2 + \cdots + z^{n-1} = 0\).

(c) Use the result in part (a) to obtain Lagrange's trigonometric identity

\[
1 + \cos\theta + \cos 2\theta + \cdots + \cos n\theta = \frac{1}{2} + \frac{\sin\left[\left(n + \tfrac{1}{2}\right)\theta\right]}{2\sin(\theta/2)}
\]

for \(0 < \theta < 2\pi\).

Hint Let \(z = \cos\theta + i\sin\theta\).



8.

Let \(\omega = e^{2\pi i/3}\). Show that the vectors \(\mathbf{v}_1 = (1/\sqrt{3})(1, 1, 1)\), \(\mathbf{v}_2 = (1/\sqrt{3})(1, \omega, \omega^2)\), and \(\mathbf{v}_3 = (1/\sqrt{3})(1, \omega^2, \omega^4)\) form an orthonormal set in \(C^3\).

Hint Use part (b) of Exercise 7.



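9.

Show that if \(U\) is an \(n \times n\) unitary matrix and \(|z_1| = |z_2| = \cdots = |z_n| = 1\), then the product

\[
U \begin{bmatrix} z_1 & 0 & \cdots & 0 \\ 0 & z_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & z_n \end{bmatrix}
\]

is also unitary.

10.

Suppose that \(A^* = -A\).

(a) Show that \(iA\) is Hermitian.

(b) Show that \(A\) is unitarily diagonalizable and has pure imaginary eigenvalues.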



11. 



Show that the eigenvalues of a unitary matrix have modulus 1. 



12. 



Under what conditions is the following matrix normal? 



A = 



a 





0" 








c 





b 






13. 



Show that if \(\mathbf{u}\) is a nonzero vector in \(C^n\) that is expressed in column form, then \(P = \mathbf{u}\mathbf{u}^*\) is Hermitian and has rank 1.

14.

Show that if \(\mathbf{u}\) is a unit vector in \(C^n\) that is expressed in column form, then \(H = I - 2\mathbf{u}\mathbf{u}^*\) is unitary and Hermitian. This is called a Householder matrix.

15.

What geometric interpretations might you reasonably give to multiplication by the matrices \(P = \mathbf{u}\mathbf{u}^*\) and \(H = I - 2\mathbf{u}\mathbf{u}^*\) in Exercises 13 and 14?
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Chapter 1 



Technology Exercises



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 

Sections 10.1 and 10.2 

T1. (Complex Numbers and Numerical Operations) Read your documentation on entering and displaying complex numbers
and for performing the basic arithmetic operations of addition, subtraction, multiplication, and division. Experiment with 
numbers of your own choosing until you feel you have mastered the operations. 



T2. (Matrices with Complex Entries) For most technology utilities the procedures for adding, subtracting, multiplying, and 
inverting matrices with complex entries are the same as for matrices with real entries. Experiment with these operations on 
some matrices of your own choosing, and then try using your utility to solve some of the exercises in Sections 10.1 and 10.2. 



T3. (Complex Conjugate) Read your documentation on finding the conjugate of a complex number, and then use your utility to 
perform the computations in Example 1 of Section 10.2. 

Section 10.3 

T1. (Modulus and Argument) Read your documentation on finding the modulus and argument of a complex number, and then
use your utility to perform the computations in Example 1.

Section 10.6 

T1. (Conjugate Transpose) Read your documentation on finding the conjugate transpose of a matrix with complex entries, and
then use your utility to perform the computations in Example 1 and Example 3.



T2. (Unitary Diagonalization) Use your technology utility to diagonalize the matrix \(A\) in Example 5 and to find a matrix \(P\) that
unitarily diagonalizes \(A\). (See Technology Exercise T1 of Section 7.2.)
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11 



CHAPTER 



Applications of Linear Algebra 



INTRODUCTION: This chapter consists of 21 applications of linear algebra. With one clearly marked exception, each 
application is in its own independent section, so sections can be deleted or permuted as desired. Each topic begins with a list 
of linear algebra prerequisites. 

Because our primary objective in this chapter is to present applications of linear algebra, proofs are often omitted. Whenever 
results from other fields are needed, they are stated precisely, with motivation where possible, but usually without proof. 
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11.1 

CONSTRUCTING CURVES AND SURFACES THROUGH SPECIFIED POINTS

In this section we describe a technique that uses determinants to construct lines, circles, and general conic sections through specified points in the plane. The procedure is also used to pass planes and spheres in 3-space through fixed points.



Prerequisites: Linear Systems 
Determinants 



Analytic Geometry 



The following theorem follows from Theorem 2.3.6. 



THEOREM 11.1.1 



A homogeneous linear system with as many equations as unknowns has a nontrivial solution if and only if the determinant of 
the coefficient matrix is zero. 



We shall now show how this result can be used to determine equations of various curves and surfaces through specified points. 



A Line through Two Points 

Suppose that \((x_1, y_1)\) and \((x_2, y_2)\) are two distinct points in the plane. There exists a unique line

\[
c_1x + c_2y + c_3 = 0 \tag{1}
\]

that passes through these two points (Figure 11.1.1). Note that \(c_1\), \(c_2\), and \(c_3\) are not all zero and that these coefficients are unique
only up to a multiplicative constant. Because \((x_1, y_1)\) and \((x_2, y_2)\) lie on the line, substituting them in 1 gives the two equations

\[
c_1x_1 + c_2y_1 + c_3 = 0 \tag{2}
\]

\[
c_1x_2 + c_2y_2 + c_3 = 0 \tag{3}
\]



Figure 11.1.1 

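The three equations, 1, 2, and 3, can be grouped together and rewritten as

\[
\begin{aligned}
xc_1 + yc_2 + c_3 &= 0 \\
x_1c_1 + y_1c_2 + c_3 &= 0 \\
x_2c_1 + y_2c_2 + c_3 &= 0
\end{aligned}
\]

which is a homogeneous linear system of three equations for \(c_1\), \(c_2\), and \(c_3\). Because \(c_1\), \(c_2\), and \(c_3\) are not all zero, this system has
a nontrivial solution, so the determinant of the system must be zero. That is,

\[
\begin{vmatrix} x & y & 1 \\ x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \end{vmatrix} = 0 \tag{4}
\]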



Consequently, every point (x, y) on the line satisfies 4; conversely, it can be shown that every point (x, y) that satisfies 4 lies on 
the line. 



EXAMPLE 1 Equation of a Line 



Find the equation of the line that passes through the two points (2, 1) and (3,7). 



Solution 

Substituting the coordinates of the two points into Equation 4 gives

\[
\begin{vmatrix} x & y & 1 \\ 2 & 1 & 1 \\ 3 & 7 & 1 \end{vmatrix} = 0
\]

The cofactor expansion of this determinant along the first row then gives

\[
-6x + y + 11 = 0
\]
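
The cofactor expansion in Example 1 can be checked mechanically. The following is a minimal sketch, not part of the original text: the function name `line_through` is hypothetical, and exact rational arithmetic is used so the result can be compared with the example directly.

```python
from fractions import Fraction


def line_through(p, q):
    """Return (c1, c2, c3) with c1*x + c2*y + c3 = 0 through points p and q.

    The coefficients are the first-row cofactors of the 3 x 3 determinant
    in Equation 4 (hypothetical helper, written for illustration only).
    """
    x1, y1 = (Fraction(v) for v in p)
    x2, y2 = (Fraction(v) for v in q)
    c1 = y1 - y2             # cofactor of x
    c2 = -(x1 - x2)          # cofactor of y
    c3 = x1 * y2 - x2 * y1   # cofactor of 1
    return c1, c2, c3


c1, c2, c3 = line_through((2, 1), (3, 7))
print(f"{c1}x + {c2}y + {c3} = 0")  # Example 1: -6x + 1y + 11 = 0
```

Because the coefficients are determined only up to a multiplicative constant, any nonzero multiple of the returned triple describes the same line.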

A Circle through Three Points 

Suppose that there are three distinct points in the plane, \((x_1, y_1)\), \((x_2, y_2)\), and \((x_3, y_3)\), not all lying on a straight line. From
analytic geometry we know that there is a unique circle, say,

\[
c_1(x^2 + y^2) + c_2x + c_3y + c_4 = 0 \tag{5}
\]

that passes through them (Figure 11.1.2). Substituting the coordinates of the three points into this equation gives

\[
c_1(x_1^2 + y_1^2) + c_2x_1 + c_3y_1 + c_4 = 0 \tag{6}
\]

\[
c_1(x_2^2 + y_2^2) + c_2x_2 + c_3y_2 + c_4 = 0 \tag{7}
\]

\[
c_1(x_3^2 + y_3^2) + c_2x_3 + c_3y_3 + c_4 = 0 \tag{8}
\]

Figure 11.1.2

As before, Equations 5 through 8 form a homogeneous linear system with a nontrivial solution for \(c_1\), \(c_2\), \(c_3\), and \(c_4\). Thus the
determinant of the coefficient matrix is zero:

\[
\begin{vmatrix}
x^2 + y^2 & x & y & 1 \\
x_1^2 + y_1^2 & x_1 & y_1 & 1 \\
x_2^2 + y_2^2 & x_2 & y_2 & 1 \\
x_3^2 + y_3^2 & x_3 & y_3 & 1
\end{vmatrix} = 0 \tag{9}
\]



This is a determinant form for the equation of the circle. 



EXAMPLE 2 Equation of a Circle 



Find the equation of the circle that passes through the three points (1,7), (6, 2), and (4, 6). 



Solution 



Substituting the coordinates of the three points into Equation 9 gives

\[
\begin{vmatrix}
x^2 + y^2 & x & y & 1 \\
50 & 1 & 7 & 1 \\
40 & 6 & 2 & 1 \\
52 & 4 & 6 & 1
\end{vmatrix} = 0
\]

which reduces to

\[
10(x^2 + y^2) - 20x - 40y - 200 = 0
\]

In standard form this is

\[
(x - 1)^2 + (y - 2)^2 = 5^2
\]

Thus the circle has center (1, 2) and radius 5.
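
As a rough check of Example 2, the first-row cofactors of the determinant in Equation 9 can be computed directly. This is a sketch under stated assumptions: the helper names `det` and `circle_through` are hypothetical, and the recursive cofactor expansion is used only because the matrices here are tiny.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def circle_through(p1, p2, p3):
    """Return (coeffs, center, radius_squared) for the circle through three points.

    coeffs = [c1, c2, c3, c4] are the first-row cofactors of Equation 9, so the
    circle is c1*(x^2 + y^2) + c2*x + c3*y + c4 = 0.
    """
    rows = [[Fraction(x * x + y * y), Fraction(x), Fraction(y), Fraction(1)]
            for x, y in (p1, p2, p3)]
    coeffs = [(-1) ** j * det([r[:j] + r[j + 1:] for r in rows]) for j in range(4)]
    c1, c2, c3, c4 = coeffs
    if c1 == 0:
        raise ValueError("the three points are collinear")
    center = (-c2 / (2 * c1), -c3 / (2 * c1))
    radius_squared = center[0] ** 2 + center[1] ** 2 - c4 / c1
    return coeffs, center, radius_squared


coeffs, center, r2 = circle_through((1, 7), (6, 2), (4, 6))
print(coeffs, center, r2)  # coefficients 10, -20, -40, -200; center (1, 2); r^2 = 25
```

The center and radius come from completing the square after dividing Equation 5 through by \(c_1\).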



A General Conic Section through Five Points 



The general equation of a conic section in the plane (a parabola, hyperbola, or ellipse, or degenerate forms of these curves) is given
by

\[
c_1x^2 + c_2xy + c_3y^2 + c_4x + c_5y + c_6 = 0
\]

This equation contains six coefficients, but we can reduce the number to five if we divide through by any one of them that is not
zero. Thus only five coefficients must be determined, so five distinct points in the plane are sufficient to determine the equation of
the conic section (Figure 11.1.3). As before, the equation can be put in determinant form (see Exercise 7):

\[
\begin{vmatrix}
x^2 & xy & y^2 & x & y & 1 \\
x_1^2 & x_1y_1 & y_1^2 & x_1 & y_1 & 1 \\
x_2^2 & x_2y_2 & y_2^2 & x_2 & y_2 & 1 \\
x_3^2 & x_3y_3 & y_3^2 & x_3 & y_3 & 1 \\
x_4^2 & x_4y_4 & y_4^2 & x_4 & y_4 & 1 \\
x_5^2 & x_5y_5 & y_5^2 & x_5 & y_5 & 1
\end{vmatrix} = 0 \tag{10}
\]




Figure 11.1.3 



EXAMPLE 3 Equation of an Orbit 



An astronomer who wants to determine the orbit of an asteroid about the sun sets up a Cartesian coordinate system in the plane of 
the orbit with the sun at the origin. Astronomical units of measurement are used along the axes (1 astronomical unit = mean 
distance of earth to sun = 93 million miles). By Kepler's first law, the orbit must be an ellipse, so the astronomer makes five 
observations of the asteroid at five different times and finds five points along the orbit to be 

(8.025, 8.310), (10.170, 6.355), (11.202, 3.212), (10.736, 0.375), (9.092, -2.267) 

Find the equation of the orbit. 



Solution 



Substituting the coordinates of the five given points into 10 gives

\[
\begin{vmatrix}
x^2 & xy & y^2 & x & y & 1 \\
64.401 & 66.688 & 69.056 & 8.025 & 8.310 & 1 \\
103.429 & 64.630 & 40.386 & 10.170 & 6.355 & 1 \\
125.485 & 35.981 & 10.317 & 11.202 & 3.212 & 1 \\
115.262 & 4.026 & 0.141 & 10.736 & 0.375 & 1 \\
82.664 & -20.612 & 5.139 & 9.092 & -2.267 & 1
\end{vmatrix} = 0
\]

The cofactor expansion of this determinant along the first row is

\[
386.799x^2 - 102.896xy + 446.026y^2 - 2476.409x - 1427.971y - 17109.378 = 0
\]

Figure 11.1.4 is an accurate diagram of the orbit, together with the five given points.




Figure 11.1.4 
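
The same cofactor idea extends to Equation 10. The sketch below is illustrative only: the names `det` and `conic_through` are hypothetical, and it is exercised on five integer points of the parabola \(y = x^2\) rather than on the rounded orbit data, so that the expected coefficients are known exactly.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def conic_through(points):
    """Return [c1, ..., c6] for c1*x^2 + c2*x*y + c3*y^2 + c4*x + c5*y + c6 = 0.

    The coefficients are the first-row cofactors of the 6 x 6 determinant in
    Equation 10, built from the five given points.
    """
    rows = []
    for x, y in points:
        x, y = Fraction(x), Fraction(y)
        rows.append([x * x, x * y, y * y, x, y, Fraction(1)])
    return [(-1) ** j * det([r[:j] + r[j + 1:] for r in rows]) for j in range(6)]


# Five points on y = x^2; the result should be a multiple of x^2 - y = 0.
c = conic_through([(0, 0), (1, 1), (-1, 1), (2, 4), (-2, 4)])
print(c)
```

With measured data such as the orbit observations in Example 3, floating-point arithmetic and rounding of the inputs mean the computed coefficients will agree with the printed ones only approximately and only up to a common scale factor.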



A Plane through Three Points 

In Exercise 8 we ask the reader to show the following: The plane in 3-space with equation

\[
c_1x + c_2y + c_3z + c_4 = 0
\]

that passes through three noncollinear points \((x_1, y_1, z_1)\), \((x_2, y_2, z_2)\), and \((x_3, y_3, z_3)\) is given by the determinant equation

\[
\begin{vmatrix}
x & y & z & 1 \\
x_1 & y_1 & z_1 & 1 \\
x_2 & y_2 & z_2 & 1 \\
x_3 & y_3 & z_3 & 1
\end{vmatrix} = 0 \tag{11}
\]



EXAMPLE 4 Equation of a Plane 



The equation of the plane that passes through the three noncollinear points (1, 1, 0), (2, 0, -1), and (2, 9, 2) is

\[
\begin{vmatrix}
x & y & z & 1 \\
1 & 1 & 0 & 1 \\
2 & 0 & -1 & 1 \\
2 & 9 & 2 & 1
\end{vmatrix} = 0
\]

which reduces to

\[
2x - y + 3z - 1 = 0
\]
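
Example 4 can be verified the same way. In this sketch (helper names `det` and `plane_through` are hypothetical), the raw first-row cofactors of Equation 11 come out as a multiple of the reduced equation, which is expected because the coefficients are determined only up to a constant factor.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def plane_through(p1, p2, p3):
    """Return [c1, c2, c3, c4] with c1*x + c2*y + c3*z + c4 = 0 through three points."""
    rows = [[Fraction(v) for v in p] + [Fraction(1)] for p in (p1, p2, p3)]
    return [(-1) ** j * det([r[:j] + r[j + 1:] for r in rows]) for j in range(4)]


coeffs = plane_through((1, 1, 0), (2, 0, -1), (2, 9, 2))
print(coeffs)                   # raw cofactors: 6, -3, 9, -3
print([c / 3 for c in coeffs])  # divided by 3: 2, -1, 3, -1, as in Example 4
```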



A Sphere through Four Points 

In Exercise 9 we ask the reader to show the following: The sphere in 3-space with equation

\[
c_1(x^2 + y^2 + z^2) + c_2x + c_3y + c_4z + c_5 = 0
\]

that passes through four noncoplanar points \((x_1, y_1, z_1)\), \((x_2, y_2, z_2)\), \((x_3, y_3, z_3)\), and \((x_4, y_4, z_4)\) is given by the following
determinant equation:

\[
\begin{vmatrix}
x^2 + y^2 + z^2 & x & y & z & 1 \\
x_1^2 + y_1^2 + z_1^2 & x_1 & y_1 & z_1 & 1 \\
x_2^2 + y_2^2 + z_2^2 & x_2 & y_2 & z_2 & 1 \\
x_3^2 + y_3^2 + z_3^2 & x_3 & y_3 & z_3 & 1 \\
x_4^2 + y_4^2 + z_4^2 & x_4 & y_4 & z_4 & 1
\end{vmatrix} = 0 \tag{12}
\]



EXAMPLE 5 Equation of a Sphere 



The equation of the sphere that passes through the four points (0, 3, 2), (1, -1, 1), (2, 1, 0), and (5, 1, 3) is

\[
\begin{vmatrix}
x^2 + y^2 + z^2 & x & y & z & 1 \\
13 & 0 & 3 & 2 & 1 \\
3 & 1 & -1 & 1 & 1 \\
5 & 2 & 1 & 0 & 1 \\
35 & 5 & 1 & 3 & 1
\end{vmatrix} = 0
\]

This reduces to

\[
x^2 + y^2 + z^2 - 4x - 2y - 6z + 5 = 0
\]

which in standard form is

\[
(x - 2)^2 + (y - 1)^2 + (z - 3)^2 = 9
\]
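
A short sketch for checking Example 5 follows. The names `det` and `sphere_through` are hypothetical; the function normalizes the first-row cofactors of Equation 12 so that the coefficient of \(x^2 + y^2 + z^2\) is 1, then completes the square to read off the center and squared radius.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def sphere_through(points):
    """Return (normalized_coeffs, center, radius_squared) for four noncoplanar points."""
    rows = []
    for p in points:
        x, y, z = (Fraction(v) for v in p)
        rows.append([x * x + y * y + z * z, x, y, z, Fraction(1)])
    coeffs = [(-1) ** j * det([r[:j] + r[j + 1:] for r in rows]) for j in range(5)]
    if coeffs[0] == 0:
        raise ValueError("the four points are coplanar")
    n = [c / coeffs[0] for c in coeffs]          # leading coefficient scaled to 1
    center = (-n[1] / 2, -n[2] / 2, -n[3] / 2)   # from completing the square
    radius_squared = sum(c * c for c in center) - n[4]
    return n, center, radius_squared


n, center, r2 = sphere_through([(0, 3, 2), (1, -1, 1), (2, 1, 0), (5, 1, 3)])
print(n, center, r2)  # 1, -4, -2, -6, 5; center (2, 1, 3); r^2 = 9
```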



Exercise Set 11.1 






1. Find the equations of the lines that pass through the following points:

(a) (1, -1), (2, 2)

(b) (0, 1), (1, -1)

2. Find the equations of the circles that pass through the following points:

(a) (2, 6), (2, 0), (5, 3)

(b) (2, -2), (3, 5), (-4, 6)

3. Find the equation of the conic section that passes through the points (0, 0), (0, -1), (2, 0), (2, -5), and (4, -1).

4. Find the equations of the planes in 3-space that pass through the following points:

(a) (1, 1, -3), (1, -1, 1), (0, -1, 2)

(b) (2, 3, 1), (2, -1, -1), (1, 2, 1)



5. 



(a) Alter Equation 11 so that it determines the plane that passes through the origin and is parallel to the plane that passes
through three specified noncollinear points.



(b) Find the two planes described in part (a) corresponding to the triplets of points in Exercises 4(a) and 4(b). 

6. Find the equations of the spheres in 3-space that pass through the following points:

(a) (1, 2, 3), (-1, 2, 1), (1, 0, 1), (1, 2, -1)

(b) (0, 1, -2), (1, 3, 1), (2, -1, 0), (3, 1, -1)

7. Show that Equation 10 is the equation of the conic section that passes through five given distinct points in the plane.

8. Show that Equation 11 is the equation of the plane in 3-space that passes through three given noncollinear points.

9. Show that Equation 12 is the equation of the sphere in 3-space that passes through four given noncoplanar points.

10. Find a determinant equation for the parabola of the form

\[
c_1y + c_2x^2 + c_3x + c_4 = 0
\]

that passes through three given noncollinear points in the plane.

11. What does Equation 9 become if the three distinct points are collinear?

12. What does Equation 11 become if the three distinct points are collinear?

13. What does Equation 12 become if the four points are coplanar?



Section 11.1 



Technology Exercises






T1. The general equation of a quadric surface is given by

\[
a_1x^2 + a_2y^2 + a_3z^2 + a_4xy + a_5xz + a_6yz + a_7x + a_8y + a_9z + a_{10} = 0
\]

Given nine points on this surface, it may be possible to determine its equation.

(a) Show that if the nine points \((x_i, y_i, z_i)\) for \(i = 1, 2, 3, \ldots, 9\) lie on this surface, and if they determine uniquely the
equation of this surface, then its equation can be written in determinant form as
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= 



(b) Use the result in part (a) to determine the equation of the quadric surface that passes through the points (1,2, 3), (2, 1, 
7), (0, 4, 6), (3, -1, 4), (3, 0, 11), (-1, 5, 8),(9, -8, 3), (4, 5, 3), and (-2, 6, 10). 



(c) Use the methods of Section 9.7 to identify the resulting surface in part (b). 



T2. 



(a) A hyperplane in the \(n\)-dimensional Euclidean space \(R^n\) has an equation of the form

\[
a_1x_1 + a_2x_2 + a_3x_3 + \cdots + a_nx_n + a_{n+1} = 0
\]

where \(a_i\), \(i = 1, 2, 3, \ldots, n + 1\), are constants, not all zero, and \(x_i\), \(i = 1, 2, 3, \ldots, n\), are variables
for which

\[
(x_1, x_2, x_3, \ldots, x_n) \in R^n
\]

A point

\[
(x_{10}, x_{20}, x_{30}, \ldots, x_{n0}) \in R^n
\]

lies on this hyperplane if

\[
a_1x_{10} + a_2x_{20} + a_3x_{30} + \cdots + a_nx_{n0} + a_{n+1} = 0
\]

Given that the \(n\) points \((x_{1i}, x_{2i}, x_{3i}, \ldots, x_{ni})\), \(i = 1, 2, 3, \ldots, n\), lie on this hyperplane and that they
uniquely determine the equation of the hyperplane, show that the equation of the hyperplane
can be written in determinant form as
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(b) Determine the equation of the hyperplane in p$ that goes through the following nine points: 



(1,2,3,4,5,6,7,8,9) (2,3,4,5,6,7,8,9,1) (3,4,5,6,7,8,9,1,2) 
(4,5,6,7,8,9,1,2,3) (5,6,7,8,9,1,2,3,4) (6,7,8,9,1,2,3,4,5) 
(7,8,9,1,2,3,4,5,6) (8,9,1,2,3,4,5,6,7) (9,1,2,3,4,5,6,7,8)
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In this section basic laws of electrical circuits are discussed, and it is shown how 
these laws can be used to obtain systems of linear equations whose solutions yield the currents flowing in an electrical circuit.

11.2 ELECTRICAL NETWORKS



Prerequisites: Linear Systems

The simplest electrical circuits consist of two basic components: electrical sources and resistors, each denoted by its own symbol in the circuit diagrams.



Electrical sources, such as batteries, create currents in an electrical circuit. Resistors, such as lightbulbs, limit the magnitudes of the 
currents. 



There are three basic quantities associated with electrical circuits: electrical potential (E), resistance (R), and current (I). These
are commonly measured in the following units:

E in volts (V)
R in ohms (Ω)
I in amperes (A)

Electrical potential is associated with two points in an electrical circuit and is measured in practice by connecting those points to a 
device called a voltmeter. For example, a common AA battery is rated at 1.5 volts, which means that this is the electrical potential
across its positive and negative terminals (Figure 11.2.1). 




Figure 11.2.1 

In an electrical circuit the electrical potential between two points is called the voltage drop between these points. As we shall see, 
currents and voltage drops can be either positive or negative. 

The flow of current in an electrical circuit is governed by three basic principles: 

1. Ohm's Law. The voltage drop across a resistor is the product of the current passing through it and its resistance; that is,

E = IR 

2. Kirchhoff's Current Law. The sum of the currents flowing into any point equals the sum of the currents flowing out from 
the point. 



3. Kirchhoff's Voltage Law. Around any closed loop, the algebraic sum of the voltage drops is zero. 



EXAMPLE 1 Finding Currents in a Circuit 



Find the unknown currents \(I_1\), \(I_2\), and \(I_3\) in the circuit shown in Figure 11.2.2.




Figure 11.2.2 



Solution 

The flow directions for the currents \(I_1\), \(I_2\), and \(I_3\) (marked by the arrowheads) were picked arbitrarily. Any of these currents that
turn out to be negative actually flow opposite to the direction selected.

Applying Kirchhoff's current law to points A and B yields

\[
\begin{aligned}
I_1 &= I_2 + I_3 && \text{(Point A)} \\
I_3 + I_2 &= I_1 && \text{(Point B)}
\end{aligned}
\]

Since these equations both simplify to the same linear equation

\[
I_1 - I_2 - I_3 = 0 \tag{1}
\]

we still need two more equations to determine \(I_1\), \(I_2\), and \(I_3\) uniquely. We will obtain them using Kirchhoff's voltage law.

To apply Kirchhoff's voltage law to a loop, select a positive direction around the loop (say clockwise) and make the following
sign conventions:

* 
A current passing through a resistor produces a positive voltage drop if it flows in the positive direction of the loop and a 
negative voltage drop if it flows in the negative direction of the loop. 

* 
A current passing through an electrical source produces a positive voltage drop if the positive direction of the loop is from + 
to - and a negative voltage drop if the positive direction of the loop is from - to +. 



Applying Kirchhoff's voltage law and Ohm's law to loop 1 in Figure 11.2.2 yields

\[
7I_1 + 3I_3 - 30 = 0 \tag{2}
\]

and applying them to loop 2 yields

\[
11I_2 - 3I_3 - 50 = 0 \tag{3}
\]

Combining 1, 2, and 3 yields the linear system

\[
\begin{aligned}
I_1 - I_2 - I_3 &= 0 \\
7I_1 + 3I_3 &= 30 \\
11I_2 - 3I_3 &= 50
\end{aligned}
\]

Solving this linear system yields the following values for the currents:

\[
I_1 = \frac{570}{131}\ \text{(A)}, \qquad I_2 = \frac{590}{131}\ \text{(A)}, \qquad I_3 = -\frac{20}{131}\ \text{(A)}
\]

Note that \(I_3\) is negative, which means that this current flows opposite to the direction indicated in Figure 11.2.2. Also note that we
could have applied Kirchhoff's voltage law to the outer loop of the circuit. However, this produces a redundant equation (try it).
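
The linear system of Example 1 is small enough to solve by hand, but a short program makes the arithmetic easy to check. The following is a minimal sketch, assuming a plain Gauss-Jordan elimination over exact fractions; the function name `solve` is hypothetical and the routine expects a square, nonsingular coefficient matrix.

```python
from fractions import Fraction


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination over fractions.

    Raises StopIteration if no nonzero pivot is found (singular matrix).
    """
    n = len(a)
    # Augmented matrix [a | b] with exact rational entries.
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]       # scale pivot row to 1
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[-1] for row in m]


# Rows: Kirchhoff's current law (1), loop 1 (2), loop 2 (3).
coefficients = [[1, -1, -1],
                [7, 0, 3],
                [0, 11, -3]]
right_side = [0, 30, 50]

i1, i2, i3 = solve(coefficients, right_side)
print(i1, i2, i3)  # 570/131 590/131 -20/131
```

The negative value of the third current matches the remark in the text that \(I_3\) flows opposite to the direction drawn in the figure.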



Exercise Set 11.2






In Exercises 1-4 find the currents in the circuits. 
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Show that if the current j 5 in the circuit of the accompanying figure is zero, then ^ 4 — R^R 2 / Ry 




Figure Ex-5 



Remark This circuit, called a Wheatstone bridge circuit, is used for the precise measurement of resistance. Here, \(R_4\) is an
unknown resistance and \(R_1\), \(R_2\), and \(R_3\) are adjustable calibrated resistors. \(R_5\) represents a galvanometer, a device for
measuring current. After the operator varies the resistances \(R_1\), \(R_2\), and \(R_3\) until the galvanometer reading is zero, the formula
\(R_4 = R_3R_2/R_1\) determines the unknown resistance \(R_4\).



6. Show that if the two currents labeled \(I\) in the circuits of the accompanying figure are equal, then

\[
R = \frac{1}{\dfrac{1}{R_1} + \dfrac{1}{R_2}}
\]



Figure Ex-6 



Section 11.2



Technology Exercises






T1. The accompanying figure shows a sequence of different circuits.



(a) Solve for the current \(I_1\) for the circuit in part (a) of the figure.



(b) Solve for the currents \(I_1\) through \(I_3\) for the circuit in part (b) of the figure.



(c) Solve for the currents J± through l^ for the circuit in part (c) of the figure. 



(d) Continue this process until you discover a pattern in the values of \(I_1, I_2, I_3, \ldots\)



(e) Investigate the sequence of values for \(I_1\) in each of the circuits in parts (a), (b), (c), and so on, and numerically show
that the limit of this sequence approaches the value
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Figure Ex-T1 






T2. The accompanying figure shows a sequence of different circuits.



(a) Solve for the current \(I_1\) for the circuit in part (a) of the figure.



(b) Solve for the current \(I_1\) for the circuit in part (b) of the figure.



(c) Solve for the current \(I_1\) for the circuit in part (c) of the figure.



(d) Continue this process until you discover a pattern in the values of \(I_1\).



(e) Investigate the sequence of values for \(I_1\) in each of the circuits in parts (a), (b), (c), and so on, and numerically show
that the limit of this sequence approaches the value



'4pi)§~<0.6180)§ 
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Figure Ex-T2 
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11 3 

GEOMETRIC LINEAR PROGRAMMING

In this section we describe a geometric technique for maximizing or minimizing a linear expression in two variables subject to a set of linear constraints.



Prerequisites: Linear Systems 

Linear Inequalities 



Linear Programming 

The study of linear programming theory has expanded greatly since the pioneering work of George Dantzig in the late 1940s. 
Today, linear programming is applied to a wide variety of problems in industry and science. In this section we present a geometric 
approach to the solution of simple linear programming problems. Let us begin with some examples. 



EXAMPLE 1 Maximizing Sales Revenue 



A candy manufacturer has 130 pounds of chocolate-covered cherries and 170 pounds of chocolate-covered mints in stock. He 
decides to sell them in the form of two different mixtures. One mixture will contain half cherries and half mints by weight and will 
sell for $2.00 per pound. The other mixture will contain one-third cherries and two-thirds mints by weight and will sell for $1.25 
per pound. How many pounds of each mixture should the candy manufacturer prepare in order to maximize his sales revenue? 

Solution 

Let us first formulate this problem mathematically. Let the mixture of half cherries and half mints be called mix A, and let \(x_1\) be the
number of pounds of this mixture to be prepared. Let the mixture of one-third cherries and two-thirds mints be called mix B, and
let \(x_2\) be the number of pounds of this mixture to be prepared. Since mix A sells for $2.00 per pound and mix B sells for $1.25 per
pound, the total sales \(z\) (in dollars) will be

\[
z = 2.00x_1 + 1.25x_2
\]

Since each pound of mix A contains 1/2 pound of cherries and each pound of mix B contains 1/3 pound of cherries, the total number
of pounds of cherries used in both mixtures is

\[
\tfrac{1}{2}x_1 + \tfrac{1}{3}x_2
\]

Similarly, since each pound of mix A contains 1/2 pound of mints and each pound of mix B contains 2/3 pound of mints, the total
number of pounds of mints used in both mixtures is

\[
\tfrac{1}{2}x_1 + \tfrac{2}{3}x_2
\]

Because the manufacturer can use at most 130 pounds of cherries and 170 pounds of mints, we must have

\[
\begin{aligned}
\tfrac{1}{2}x_1 + \tfrac{1}{3}x_2 &\le 130 \\
\tfrac{1}{2}x_1 + \tfrac{2}{3}x_2 &\le 170
\end{aligned}
\]

Furthermore, since \(x_1\) and \(x_2\) cannot be negative numbers, we must have

\[
x_1 \ge 0 \quad\text{and}\quad x_2 \ge 0
\]

The problem can therefore be formulated mathematically as follows: Find values of \(x_1\) and \(x_2\) that maximize

\[
z = 2.00x_1 + 1.25x_2
\]

subject to

\[
\begin{aligned}
\tfrac{1}{2}x_1 + \tfrac{1}{3}x_2 &\le 130 \\
\tfrac{1}{2}x_1 + \tfrac{2}{3}x_2 &\le 170 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]

Later in this section we shall show how to solve this type of mathematical problem geometrically.



EXAMPLE 2 Maximizing Annual Yield 



A woman has up to $10,000 to invest. Her broker suggests investing in two bonds, A and B. Bond A is a rather risky bond with an 
annual yield of 10%, and bond B is a rather safe bond with an annual yield of 7%. After some consideration, she decides to invest 
at most $6000 in bond A, to invest at least $2000 in bond B, and to invest at least as much in bond A as in bond B. How should she
invest her $10,000 in order to maximize her annual yield? 

Solution 

To formulate this problem mathematically, let \(x_1\) be the number of dollars to be invested in bond A, and let \(x_2\) be the number of
dollars to be invested in bond B. Since each dollar invested in bond A earns $.10 per year and each dollar invested in bond B earns
$.07 per year, the total dollar amount \(z\) earned each year by both bonds is

\[
z = .10x_1 + .07x_2
\]

The constraints imposed can be formulated mathematically as follows:

Invest no more than $10,000: \(x_1 + x_2 \le 10{,}000\)

Invest at most $6000 in bond A: \(x_1 \le 6000\)

Invest at least $2000 in bond B: \(x_2 \ge 2000\)

Invest at least as much in bond A as in bond B: \(x_1 \ge x_2\)

We also have the implicit assumption that \(x_1\) and \(x_2\) are nonnegative:

\[
x_1 \ge 0 \quad\text{and}\quad x_2 \ge 0
\]

Thus the complete mathematical formulation of the problem is as follows: Find values of \(x_1\) and \(x_2\) that maximize

\[
z = .10x_1 + .07x_2
\]

subject to

\[
\begin{aligned}
x_1 + x_2 &\le 10{,}000 \\
x_1 &\le 6000 \\
x_2 &\ge 2000 \\
x_1 - x_2 &\ge 0 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]



EXAMPLE 3 Minimizing Cost 



A student desires to design a breakfast of corn flakes and milk that is as economical as possible. On the basis of what he eats
during his other meals, he decides that his breakfast should supply him with at least 9 grams of protein, at least 1/3 the recommended daily allowance (RDA) of
vitamin D, and at least 1/4 the RDA of calcium. He finds the following nutrition information on the milk and corn flakes containers:





             Milk           Corn Flakes
             (1/2 cup)      (1 ounce)

Cost         7.5 cents      5.0 cents
Protein      4 grams        2 grams
Vitamin D    1/8 of RDA     1/10 of RDA
Calcium      1/6 of RDA     None



In order not to have his mixture too soggy or too dry, the student decides to limit himself to mixtures that contain 1 to 3 ounces of 
corn flakes per cup of milk, inclusive. What quantities of milk and corn flakes should he use to minimize the cost of his breakfast? 



Solution 



For the mathematical formulation of this problem, let \(x_1\) be the quantity of milk used (measured in 1/2-cup units), and let \(x_2\) be the
quantity of corn flakes used (measured in 1-ounce units). Then if \(z\) is the cost of the breakfast in cents, we may write the following.

Cost of breakfast: \(z = 7.5x_1 + 5.0x_2\)

At least 9 grams protein: \(4x_1 + 2x_2 \ge 9\)

At least 1/3 RDA vitamin D: \(\tfrac{1}{8}x_1 + \tfrac{1}{10}x_2 \ge \tfrac{1}{3}\)

At least 1/4 RDA calcium: \(\tfrac{1}{6}x_1 \ge \tfrac{1}{4}\)

At least 1 ounce corn flakes per cup (two 1/2-cups) of milk: \(\dfrac{x_2}{x_1} \ge \dfrac{1}{2}\) (or \(x_1 - 2x_2 \le 0\))

At most 3 ounces corn flakes per cup (two 1/2-cups) of milk: \(\dfrac{x_2}{x_1} \le \dfrac{3}{2}\) (or \(3x_1 - 2x_2 \ge 0\))

As before, we also have the implicit assumption that \(x_1 \ge 0\) and \(x_2 \ge 0\). Thus the complete mathematical formulation of the
problem is as follows: Find values of \(x_1\) and \(x_2\) that minimize

\[
z = 7.5x_1 + 5.0x_2
\]

subject to

\[
\begin{aligned}
4x_1 + 2x_2 &\ge 9 \\
\tfrac{1}{8}x_1 + \tfrac{1}{10}x_2 &\ge \tfrac{1}{3} \\
\tfrac{1}{6}x_1 &\ge \tfrac{1}{4} \\
x_1 - 2x_2 &\le 0 \\
3x_1 - 2x_2 &\ge 0 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]


Geometric Solution of Linear Programming Problems 

Each of the preceding three examples is a special case of the following problem. 
Problem Find values of \(x_1\) and \(x_2\) that either maximize or minimize

\[
z = c_1x_1 + c_2x_2 \tag{1}
\]

subject to

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 \;&(\le)(\ge)(=)\; b_1 \\
a_{21}x_1 + a_{22}x_2 \;&(\le)(\ge)(=)\; b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 \;&(\le)(\ge)(=)\; b_m
\end{aligned} \tag{2}
\]

and

\[
x_1 \ge 0, \quad x_2 \ge 0 \tag{3}
\]

In each of the \(m\) conditions of 2, any one of the symbols \(\le\), \(\ge\), and \(=\) may be used.

The problem above is called the general linear programming problem in two variables. The linear function \(z\) in 1 is called the
objective function. Equations 2 and 3 are called the constraints; in particular, the equations in 3 are called the nonnegativity
constraints on the variables \(x_1\) and \(x_2\).

We shall now show how to solve a linear programming problem in two variables graphically. A pair of values \((x_1, x_2)\) that satisfy
all of the constraints is called a feasible solution. The set of all feasible solutions determines a subset of the \(x_1x_2\)-plane called the
feasible region. Our desire is to find a feasible solution that maximizes or minimizes the objective function, as the problem requires. Such a solution is called an optimal
solution.

To examine the feasible region of a linear programming problem, let us note that each constraint of the form

\[
a_{i1}x_1 + a_{i2}x_2 = b_i
\]

defines a line in the \(x_1x_2\)-plane, whereas each constraint of the form

\[
a_{i1}x_1 + a_{i2}x_2 \le b_i \quad\text{or}\quad a_{i1}x_1 + a_{i2}x_2 \ge b_i
\]

defines a half-plane that includes its boundary line

\[
a_{i1}x_1 + a_{i2}x_2 = b_i
\]

Thus the feasible region is always an intersection of finitely many lines and half-planes. For example, the four constraints

\[
\begin{aligned}
\tfrac{1}{2}x_1 + \tfrac{1}{3}x_2 &\le 130 \\
\tfrac{1}{2}x_1 + \tfrac{2}{3}x_2 &\le 170 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]

of Example 1 define the half-planes illustrated in parts (a), (b), (c), and (d) of Figure 11.3.1. The feasible region of this problem is
thus the intersection of these four half-planes, which is illustrated in Figure 11.3.1e.
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Figure 11.3.1 



It can be shown that the feasible region of a linear programming problem has a boundary consisting of a finite number of straight
line segments. If the feasible region can be enclosed in a sufficiently large circle, it is called bounded (Figure 11.3.1e); otherwise,
it is called unbounded (see Figure 11.3.5). If the feasible region is empty (contains no points), then the constraints are inconsistent
and the linear programming problem has no solution (see Figure 11.3.6).

Those boundary points of a feasible region that are intersections of two of the straight line boundary segments are called extreme
points. (They are also called corner points and vertex points.) For example, in Figure 11.3.1e, we see that the feasible region of
Example 1 has four extreme points:

$(0, 0), \quad (0, 255), \quad (180, 120), \quad (260, 0)$ (4)

The importance of the extreme points of a feasible region is shown by the following theorem. 
THEOREM 11.3.1 



Maximum and Minimum Values 

If the feasible region of a linear programming problem is nonempty and bounded, then the objective function attains both a 
maximum and a minimum value, and these occur at extreme points of the feasible region. If the feasible region is unbounded, 
then the objective function may or may not attain a maximum or minimum value; however, if it attains a maximum or minimum 
value, it does so at an extreme point. 



Figure 11.3.2 suggests the idea behind the proof of this theorem. Since the objective function

$z = c_1x_1 + c_2x_2$

of a linear programming problem is a linear function of $x_1$ and $x_2$, its level curves (the curves along which z has constant values)
are straight lines. As we move in a direction perpendicular to these level curves, the objective function either increases or decreases
monotonically. Within a bounded feasible region, the maximum and minimum values of z must therefore occur at extreme points,
as Figure 11.3.2 indicates.

Figure 11.3.2 Level curves

In the next few examples we use Theorem 11.3.1 to solve several linear programming problems and illustrate the variations in the
nature of the solutions that may occur.



EXAMPLE 4 Example 1 Revisited 



Figure 11.3.1e shows that the feasible region of Example 1 is bounded. Consequently, from Theorem 11.3.1 the objective function

$z = 2.00x_1 + 1.25x_2$

attains both its minimum and maximum values at extreme points. The four extreme points and the corresponding values of z are
given in the following table.

Extreme Point      Value of
$(x_1, x_2)$       $z = 2.00x_1 + 1.25x_2$

(0, 0)             0
(0, 255)           318.75
(180, 120)         510.00
(260, 0)           520.00

We see that the largest value of z is 520.00 and the corresponding optimal solution is (260, 0). Thus the candy manufacturer attains
maximum sales of $520 when he produces 260 pounds of mixture A and none of mixture B.
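The table in Example 4 can be reproduced with a few lines of Python. This is only an illustrative sketch: the names `extreme_points`, `objective`, `values`, and `best_point` are assumptions introduced here, and the extreme points are simply copied from list (4) rather than computed.

```python
# Extreme points of the feasible region of Example 1, copied from list (4).
extreme_points = [(0, 0), (0, 255), (180, 120), (260, 0)]


def objective(x1, x2):
    """Objective function z = 2.00*x1 + 1.25*x2 of Example 1."""
    return 2.00 * x1 + 1.25 * x2


# Evaluate z at every extreme point, as Theorem 11.3.1 suggests.
values = {point: objective(*point) for point in extreme_points}

# The extreme point with the largest value of z is an optimal solution.
best_point = max(values, key=values.get)

for point, z in values.items():
    print(point, z)
print("maximum at", best_point, "with z =", values[best_point])
```

Because the feasible region is bounded, comparing these four values is enough to find the maximum.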



EXAMPLE 5 Using Theorem 11.3.1

Find values of $x_1$ and $x_2$ that maximize

$z = x_1 + 3x_2$

subject to

$2x_1 + 3x_2 \le 24$
$x_1 - x_2 \le 7$
$x_2 \le 6$
$x_1 \ge 0$
$x_2 \ge 0$

Solution

In Figure 11.3.3 we have drawn the feasible region of this problem. Since it is bounded, the maximum value of z is attained at one
of the five extreme points. The values of the objective function at the five extreme points are given in the following table.

Figure 11.3.3

Extreme Point      Value of
$(x_1, x_2)$       $z = x_1 + 3x_2$

(0, 6)             18
(3, 6)             21
(9, 2)             15
(7, 0)             7
(0, 0)             0

From this table, the maximum value of z is 21, which is attained at $x_1 = 3$ and $x_2 = 6$.
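The extreme points in Example 5 were read off a figure, but they can also be found by intersecting the boundary lines pairwise and keeping only the intersections that satisfy every constraint. The following Python sketch does this under stated assumptions: every constraint is rewritten in the form $a_1x_1 + a_2x_2 \le b$, and the helper names `intersection`, `feasible`, and `z` are hypothetical, not part of the text.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is stored as (a1, a2, b), meaning a1*x1 + a2*x2 <= b.
# The nonnegativity constraints are rewritten as -x1 <= 0 and -x2 <= 0.
constraints = [(2, 3, 24), (1, -1, 7), (0, 1, 6), (-1, 0, 0), (0, -1, 0)]


def intersection(c, d):
    """Intersect two boundary lines by Cramer's rule; None if they are parallel."""
    (a1, a2, b), (c1, c2, e) = c, d
    det = a1 * c2 - a2 * c1
    if det == 0:
        return None
    return (Fraction(b * c2 - a2 * e, det), Fraction(a1 * e - b * c1, det))


def feasible(p):
    """True if the point p satisfies all of the constraints."""
    return all(a1 * p[0] + a2 * p[1] <= b for a1, a2, b in constraints)


def z(p):
    """Objective function z = x1 + 3*x2 of Example 5."""
    return p[0] + 3 * p[1]


# A feasible point lying on two nonparallel boundary lines is an extreme point.
candidates = (intersection(c, d) for c, d in combinations(constraints, 2))
extreme_points = {p for p in candidates if p is not None and feasible(p)}

best = max(extreme_points, key=z)
print(f"x1 = {best[0]}, x2 = {best[1]}, z = {z(best)}")
```

Exact rational arithmetic is used so that the feasibility tests are not affected by rounding. This brute-force search is only practical for small problems such as the ones in this section.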



EXAMPLE 6 Using Theorem 11.3.1

Find values of $x_1$ and $x_2$ that maximize

$z = 4x_1 + 6x_2$

subject to

$2x_1 + 3x_2 \le 24$
$x_1 - x_2 \le 7$
$x_2 \le 6$
$x_1 \ge 0$
$x_2 \ge 0$

Solution

The constraints in this problem are identical to the constraints in Example 5, so the feasible region of this problem is also given by
Figure 11.3.3. The values of the objective function at the extreme points are given in the following table.

Extreme Point      Value of
$(x_1, x_2)$       $z = 4x_1 + 6x_2$

(0, 6)             36
(3, 6)             48
(9, 2)             48
(7, 0)             28
(0, 0)             0







We see that the objective function attains a maximum value of 48 at two adjacent extreme points, (3, 6) and (9, 2). This shows that 
an optimal solution to a linear programming problem need not be unique. As we ask the reader to show in Exercise 10, if the 
objective function has the same value at two adjacent extreme points, it has the same value at all points on the straight line 
boundary segment connecting the two extreme points. Thus, in this example the maximum value of z is attained at all points on the 
straight line segment connecting the extreme points (3, 6) and (9, 2). 
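The claim that z is constant along the segment joining (3, 6) and (9, 2) can be spot-checked numerically. The sketch below samples five points of the form $t(3, 6) + (1 - t)(9, 2)$ with $0 \le t \le 1$; the sample spacing and the variable names are assumptions chosen for illustration, and a finite sample is of course not a proof.

```python
from fractions import Fraction


def z(x1, x2):
    """Objective function z = 4*x1 + 6*x2 of Example 6."""
    return 4 * x1 + 6 * x2


p, q = (3, 6), (9, 2)  # the two adjacent extreme points with z = 48

values = []
for i in range(5):
    t = Fraction(i, 4)                # t = 0, 1/4, 1/2, 3/4, 1
    x1 = t * p[0] + (1 - t) * q[0]    # point on the segment from q to p
    x2 = t * p[1] + (1 - t) * q[1]
    values.append(z(x1, x2))

print(values)
```

Every sampled point gives the same value of z because the segment lies on the level curve $4x_1 + 6x_2 = 48$, which is parallel to the constraint line $2x_1 + 3x_2 = 24$.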



EXAMPLE 7 The Feasible Region Is a Line Segment 



Find values of $x_1$ and $x_2$ that minimize

$z = 2x_1 - x_2$

subject to

$2x_1 + 3x_2 = 12$
$2x_1 - 3x_2 \ge 0$
$x_1 \ge 0$
$x_2 \ge 0$

Solution

In Figure 11.3.4 we have drawn the feasible region of this problem. Because one of the constraints is an equality constraint, the
feasible region is a straight line segment with two extreme points. The values of z at the two extreme points are given in the
following table.

Figure 11.3.4

Extreme Point      Value of
$(x_1, x_2)$       $z = 2x_1 - x_2$

(3, 2)             4
(6, 0)             12

The minimum value of z is thus 4 and is attained at $x_1 = 3$ and $x_2 = 2$.



EXAMPLE 8 Using Theorem 11.3.1

Find values of $x_1$ and $x_2$ that maximize

$z = 2x_1 + 5x_2$

subject to

$2x_1 + x_2 \ge 8$
$-4x_1 + x_2 \le 2$
$2x_1 - 3x_2 \le 0$
$x_1 \ge 0$
$x_2 \ge 0$

Solution

The feasible region of this linear programming problem is illustrated in Figure 11.3.5. Since it is unbounded, we are not assured by
Theorem 11.3.1 that the objective function attains a maximum value. In fact, it is easily seen that since the feasible region contains
points for which both $x_1$ and $x_2$ are arbitrarily large and positive, the objective function

$z = 2x_1 + 5x_2$

can be made arbitrarily large and positive. This problem has no optimal solution. Instead, we say the problem has an unbounded
solution.

Figure 11.3.5
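One way to see the unboundedness in Example 8 concretely is to follow the points (t, t) as t grows. The Python sketch below checks that a few such points are feasible and shows z growing without limit along them; the choice of the ray (t, t) and the names `feasible`, `z`, `ray`, and `values` are assumptions made for this illustration.

```python
def feasible(x1, x2):
    """True if (x1, x2) satisfies all five constraints of Example 8."""
    return (
        2 * x1 + x2 >= 8
        and -4 * x1 + x2 <= 2
        and 2 * x1 - 3 * x2 <= 0
        and x1 >= 0
        and x2 >= 0
    )


def z(x1, x2):
    """Objective function z = 2*x1 + 5*x2 of Example 8."""
    return 2 * x1 + 5 * x2


# Sample points on the ray x1 = x2 = t, which stays feasible once t >= 8/3.
ray = [(t, t) for t in (3, 10, 100, 1000)]
values = [z(*p) for p in ray]

for p, value in zip(ray, values):
    print(p, feasible(*p), value)
```

Along this ray z = 7t, so no feasible point can be a maximizer.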




EXAMPLE 9 Using Theorem 11.3.1

Find values of $x_1$ and $x_2$ that maximize

$z = -5x_1 + x_2$

subject to

$2x_1 + x_2 \ge 8$
$-4x_1 + x_2 \le 2$
$2x_1 - 3x_2 \le 0$
$x_1 \ge 0$
$x_2 \ge 0$

Solution

The above constraints are the same as those in Example 8, so the feasible region of this problem is also given by Figure 11.3.5. In
Exercise 11 we ask the reader to show that the objective function of this problem attains a maximum within the feasible region. By
Theorem 11.3.1, this maximum must be attained at an extreme point. The values of z at the two extreme points of the feasible
region are given in the following table.

Extreme Point      Value of
$(x_1, x_2)$       $z = -5x_1 + x_2$

(1, 6)             1
(3, 2)             -13

The maximum value of z is thus 1 and is attained at the extreme point $x_1 = 1$, $x_2 = 6$.



EXAMPLE 10 Inconsistent Constraints 



Find values of $x_1$ and $x_2$ that minimize

$z = 3x_1 - 8x_2$

subject to

$2x_1 - x_2 \le 4$

$3x_1 + 4x_2 \ge 24$
$x_1 \ge 0$
$x_2 \ge 0$

Solution

As can be seen from Figure 11.3.6, the intersection of the five half-planes defined by the five constraints is empty. This linear
programming problem has no feasible solutions since the constraints are inconsistent.




Figure 11.3.6 



There are no points common to all five shaded half-planes. 



Exercise Set 11.3






1. Find values of $x_1$ and $x_2$ that maximize

$z = 3x_1 + 2x_2$

subject to

$2x_1 + 3x_2 \le 6$
$2x_1 - x_2 \ge 0$
$x_1 \le 2$
$x_2 \ge 0$



2. Find values of $x_1$ and $x_2$ that minimize

$z = 3x_1 - 5x_2$

subject to

$2x_1 - x_2 \le -2$
$4x_1 - x_2 \ge 0$
$x_2 \le 3$
$x_1 \ge 0$
$x_2 \ge 0$

3. Find values of $x_1$ and $x_2$ that minimize

$z = -3x_1 + 2x_2$

subject to

$3x_1 - x_2 \ge -5$
$-x_1 + x_2 \ge 1$
$2x_1 + 4x_2 \ge 12$
$x_1 \ge 0$
$x_2 \ge 0$

4. Solve the linear programming problem posed in Example 2.

5. Solve the linear programming problem posed in Example 3.

6. In Example 5 the constraint $x_1 - x_2 \le 7$ is said to be nonbinding because it can be removed from the problem without affecting
the solution. Likewise, the constraint $x_2 \le 6$ is said to be binding because removing it will change the solution.

(a) Which of the remaining constraints are nonbinding and which are binding?

(b) For what values of the right-hand side of the nonbinding constraint $x_1 - x_2 \le 7$ will this constraint become binding? For
what values will the resulting feasible set be empty?

(c) For what values of the right-hand side of the binding constraint $x_2 \le 6$ will this constraint become nonbinding? For
what values will the resulting feasible set be empty?

7. A trucking firm ships the containers of two companies, A and B. Each container from company A weighs 40 pounds and is 2
cubic feet in volume. Each container from company B weighs 50 pounds and is 3 cubic feet in volume. The trucking firm
charges company A $2.20 for each container shipped and charges company B $3.00 for each container shipped. If one of the
firm's trucks cannot carry more than 37,000 pounds and cannot hold more than 2000 cubic feet, how many containers from
companies A and B should a truck carry to maximize the shipping charges?

8. Repeat Exercise 7 if the trucking firm raises its price for shipping a container from company A to $2.50.

9. A manufacturer produces sacks of chicken feed from two ingredients, A and B. Each sack is to contain at least 10 ounces of
nutrient $N_1$, at least 8 ounces of nutrient $N_2$, and at least 12 ounces of nutrient $N_3$. Each pound of ingredient A contains 2
ounces of nutrient $N_1$, 2 ounces of nutrient $N_2$, and 6 ounces of nutrient $N_3$. Each pound of ingredient B contains 5 ounces of
nutrient $N_1$, 3 ounces of nutrient $N_2$, and 4 ounces of nutrient $N_3$. If ingredient A costs 8 cents per pound and ingredient B
costs 9 cents per pound, how much of each ingredient should the manufacturer use in each sack of feed to minimize his costs?

10. If the objective function of a linear programming problem has the same value at two adjacent extreme points, show that it has
the same value at all points on the straight line segment connecting the two extreme points.

Hint If $(x_1', x_2')$ and $(x_1'', x_2'')$ are any two points in the plane, a point $(x_1, x_2)$ lies on the straight line segment connecting
them if

$x_1 = tx_1' + (1 - t)x_1''$

and

$x_2 = tx_2' + (1 - t)x_2''$

where t is a number in the interval [0, 1].

11. Show that the objective function in Example 9 attains a maximum value in the feasible set.

Hint Examine the level curves of the objective function.



Section 11.3

Technology Exercises



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 

T1. Consider the feasible region consisting of $0 \le x$, $0 \le y$ along with the set of inequalities




for $k = 0, 1, 2, \ldots, n - 1$. Maximize the objective function

$z = 3x + 4y$

assuming that (a) n = 1, (b) n = 2, (c) n = 3, (d) n = 4, (e) n = 5, (f) n = 6, (g) n = 7, (h) n = 8, (i) n = 9, (j) n = 10, and (k)
n = 11. (l) Next, maximize this objective function using the nonlinear feasible region, $0 \le x$, $0 \le y$, and

(m) Let the results of parts (a) through (k) begin a sequence of values for $z_{\max}$. Do these values approach the value determined
in part (l)? Explain.

T2. Repeat Exercise T1 using the objective function $z = x + y$.

11.4

THE EARLIEST
APPLICATIONS OF
LINEAR ALGEBRA



Linear systems can be found in the earliest writings of many ancient
civilizations. We give some examples of the types of problems that they used to
solve.



Prerequisites: Linear Systems 



The practical problems of early civilizations included the measurement of land, the distribution of goods, the tracking of resources 
such as wheat and cattle, and taxation and inheritance calculations. In many cases, these problems led to linear systems of 
equations since linearity is one of the simplest relationships that can exist among variables. In this section we present examples
from five diverse ancient cultures illustrating how they used and solved systems of linear equations. We restrict ourselves to 
examples before A.D. 500. These examples consequently predate the development, by Islamic/Arab mathematicians, of the field of 
algebra, a field that ultimately led, in the nineteenth century, to the branch of mathematics now called linear algebra. 



EXAMPLE 1 Egypt (about 1650 B.C.)



Problem 40 of the Ahmes Papyrus



The Ahmes (or Rhind) Papyrus is the source of most of our information about ancient Egyptian mathematics. This 5-meter-long
papyrus contains 84 short mathematical problems, together with their solutions, and dates from about 1650 B.C. Problem 40 in this 
papyrus is the following: 



Divide 100 hekats of barley among five men in arithmetic progression so that the sum of the two smallest is one-seventh the 
sum of the three largest. 



Let a be the least amount that any man obtains, and let d be the common difference of the terms in the arithmetic progression. Then
the other four men receive $a + d$, $a + 2d$, $a + 3d$, and $a + 4d$ hekats. The two conditions of the problem require that

$a + (a + d) + (a + 2d) + (a + 3d) + (a + 4d) = 100$
$\tfrac{1}{7}[(a + 2d) + (a + 3d) + (a + 4d)] = a + (a + d)$

These equations reduce to the following system of two equations in two unknowns:

$5a + 10d = 100$
$11a - 2d = 0$

The solution technique described in the papyrus is known as the method of false position or false assumption. It begins by
assuming some convenient value of a (in our case a = 1), substituting that value into the second equation, and obtaining $d = 11/2$.
Substituting $a = 1$ and $d = 11/2$ into the left-hand side of the first equation gives 60, whereas the right-hand side is 100.
Adjusting the initial guess for a by multiplying it by 100/60 leads to the correct value $a = 5/3$. Substituting $a = 5/3$ into the
second equation then gives $d = 55/6$, so the quantities of barley received by the five men are 10/6, 65/6, 120/6, 175/6, and 230/6
hekats. This technique of guessing a value of an unknown and later adjusting it has been used by many cultures throughout the
ages.
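The false-position steps of Example 1 translate directly into a short computation. The following Python sketch is a modern restatement, not the papyrus's own procedure: the function names `total` and `d_from` are assumptions introduced here, and exact fractions are used so that the results match the values quoted in the text.

```python
from fractions import Fraction


def total(a, d):
    """Left-hand side of the first equation, 5a + 10d."""
    return 5 * a + 10 * d


def d_from(a):
    """Solve the second equation, 11a - 2d = 0, for d."""
    return Fraction(11, 2) * a


guess = Fraction(1)                          # convenient trial value a = 1
trial_total = total(guess, d_from(guess))    # what the guess yields instead of 100

# Rescale the guess by (required total) / (total obtained from the guess).
a = guess * Fraction(100) / trial_total
d = d_from(a)

shares = [a + k * d for k in range(5)]       # the five men's portions
print(a, d, [str(s) for s in shares])
```

The rescaling works because both equations are linear and the second one is homogeneous, so multiplying a and d by a common factor preserves the second equation while scaling the total.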



EXAMPLE 2 Babylonia (1900-1600 B.C.)



Babylonian Clay Tablet Ca MLA 1950



The Old Babylonian Empire flourished in Mesopotamia between 1900 and 1600 B.C. Many clay tablets containing mathematical 
tables and problems survive from that period, one of which (designated Ca MLA 1950) contains the next problem. The statement 
of the problem is a bit muddled because of the condition of the tablet, but the diagram and the solution on the tablet indicate that 
the problem is as follows: 






A trapezoid with an area of 320 square units is cut off from a right triangle by a line parallel to one of its sides. The other side 
has length 50 units, and the height of the trapezoid is 20 units. What are the upper and the lower widths of the trapezoid? 



Let x be the lower width of the trapezoid and y its upper width. The area of the trapezoid is its height times its average width, so
$20\left(\frac{x + y}{2}\right) = 320$. Using similar triangles, we also have $\frac{x}{50} = \frac{y}{30}$. The solution on the tablet uses these relations to generate the
linear system

$\tfrac{1}{2}(x + y) = 16$
$\tfrac{1}{2}(x - y) = 4$ (2)

Adding and subtracting these two equations then gives the solution $x = 20$ and $y = 12$.



EXAMPLE 3 China (A.D. 263)



Chiu Chang Suan Shu in Chinese characters



The most important treatise in the history of Chinese mathematics is the Chiu Chang Suan Shu, or "The Nine Chapters of the 
Mathematical Art." This treatise, which is a collection of 246 problems and their solutions, was assembled in its final form by Liu 
Hui in A.D. 263. Its contents, however, go back to at least the beginning of the Han dynasty in the second century B.C. The eighth of 
its nine chapters, entitled "The Way of Calculating by Arrays," contains 18 word problems that lead to linear systems in three to 
six unknowns. The general solution procedure described is almost identical to the Gaussian elimination technique developed in 
Europe in the nineteenth century by Carl Friedrich Gauss (see page 13 for his biography). The first problem in the eighth chapter is 
the following: 



There are three classes of corn, of which three bundles of the first class, two of the second, and one of the third make 39
measures. Two of the first, three of the second, and one of the third make 34 measures. And one of the first, two of the second, 
and three of the third make 26 measures. How many measures of grain are contained in one bundle of each class? 



Let x, y, and z be the measures of the first, second, and third classes of corn. Then the conditions of the problem lead to the
following linear system of three equations in three unknowns:

$3x + 2y + z = 39$
$2x + 3y + z = 34$ (3)
$x + 2y + 3z = 26$

The solution described in the treatise represented the coefficients of each equation by an appropriate number of rods placed within
squares on a counting table. Positive coefficients were represented by black rods, negative coefficients were represented by red
rods, and the squares corresponding to zero coefficients were left empty. The counting table was laid out as follows so that the
coefficients of each equation appear in columns with the first equation in the rightmost column:

 1   2   3
 2   3   2
 3   1   1
26  34  39

Next the numbers of rods within the squares were adjusted to accomplish the following two steps: (1) two times the numbers of the
third column were subtracted from three times the numbers in the second column and (2) the numbers in the third column were
subtracted from three times the numbers in the first column. The result was the following array:

         3
 4   5   2
 8   1   1
39  24  39

In this array, four times the numbers in the second column were subtracted from five times the numbers in the first column,
yielding

         3
     5   2
36   1   1
99  24  39

This last array is equivalent to the linear system

$3x + 2y + z = 39$
$5y + z = 24$
$36z = 99$

This triangular system was solved by a method equivalent to back-substitution to obtain $x = 37/4$, $y = 17/4$, and $z = 11/4$.
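The column operations of Example 3 can be replayed in code. In the sketch below each column of the counting table is a Python list, read from top to bottom; the names `left`, `middle`, `right`, and `combine` are assumptions introduced for illustration, and the back-substitution at the end is a modern restatement of the final step.

```python
from fractions import Fraction

# Columns of the counting table, read top to bottom. The rightmost column
# holds the first equation, so `right` is 3x + 2y + z = 39.
left = [1, 2, 3, 26]
middle = [2, 3, 1, 34]
right = [3, 2, 1, 39]


def combine(m, col_a, n, col_b):
    """Return m*col_a - n*col_b, entry by entry."""
    return [m * a - n * b for a, b in zip(col_a, col_b)]


middle = combine(3, middle, 2, right)   # step (1): 3*(second) - 2*(third)
left = combine(3, left, 1, right)       # step (2): 3*(first) - (third)
left = combine(5, left, 4, middle)      # final step: 5*(first) - 4*(second)

# Back-substitution on the triangular system that the columns now encode.
z = Fraction(left[3], left[2])
y = (middle[3] - middle[2] * z) / middle[1]
x = (right[3] - right[1] * y - right[2] * z) / right[0]

print(left, middle, right)
print(x, y, z)
```

Multiplying before subtracting keeps every entry an integer, which suits a table of counting rods; fractions only appear in the final back-substitution.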



EXAMPLE 4 Greece (third century B.C.) 




Archimedes c. 287-212 B.C. 



Perhaps the most famous system of linear equations from antiquity is the one associated with the first part of Archimedes' 
celebrated Cattle Problem. This problem supposedly was posed by Archimedes as a challenge to his colleague Eratosthenes. No 
solution has come down to us from ancient times, so that it is not known how, or even whether, either of these two geometers 
solved it. 



If thou art diligent and wise, O stranger, compute the number of cattle of the Sun, who once upon a time grazed on the fields of 
the Thrinacian isle of Sicily, divided into four herds of different colors, one milk white, another glossy black, a third yellow, and 
the last dappled. In each herd were bulls, mighty in number according to these proportions: Understand, stranger, that the 
white bulls were equal to a half and a third of the black together with the whole of the yellow, while the black were equal to the 
fourth part of the dappled and a fifth, together with, once more, the whole of the yellow. Observe further that the remaining 
bulls, the dappled, were equal to a sixth part of the white and a seventh, together with all of the yellow. These were the 
proportions of the cows: The white were precisely equal to the third part and a fourth of the whole herd of the black; while the 
black were equal to the fourth part once more of the dappled and with it a fifth part, when all, including the bulls, went to 
pasture together. Now the dappled in four parts were equal in number to a fifth part and a sixth of the yellow herd. Finally the 
yellow were in number equal to a sixth part and a seventh of the white herd. If thou canst accurately tell, O stranger, the 
number of cattle of the Sun, giving separately the number of well-fed bulls and again the number of females according to each 
color, thou wouldst not be called unskilled or ignorant of numbers, but not yet shalt thou be numbered among the wise. 



The conventional designation of the eight variables in this problem is

W = number of white bulls
B = number of black bulls
Y = number of yellow bulls
D = number of dappled bulls
w = number of white cows
b = number of black cows
y = number of yellow cows
d = number of dappled cows

The problem can now be stated as the following seven homogeneous equations in eight unknowns:

1. $W = \left(\tfrac{1}{2} + \tfrac{1}{3}\right)B + Y$ (The white bulls were equal to a half and a third of the black [bulls] together with the whole of
the yellow [bulls].)

2. $B = \left(\tfrac{1}{4} + \tfrac{1}{5}\right)D + Y$ (The black [bulls] were equal to the fourth part of the dappled [bulls] and a fifth, together with,
once more, the whole of the yellow [bulls].)

3. $D = \left(\tfrac{1}{6} + \tfrac{1}{7}\right)W + Y$ (The remaining bulls, the dappled, were equal to a sixth part of the white [bulls] and a seventh,
together with all of the yellow [bulls].)

4. $w = \left(\tfrac{1}{3} + \tfrac{1}{4}\right)(B + b)$ (The white [cows] were precisely equal to the third part and a fourth of the whole herd of the
black.)

5. $b = \left(\tfrac{1}{4} + \tfrac{1}{5}\right)(D + d)$ (The black [cows] were equal to the fourth part once more of the dappled and with it a fifth part,
when all, including the bulls, went to pasture together.)

6. $d = \left(\tfrac{1}{5} + \tfrac{1}{6}\right)(Y + y)$ (The dappled [cows] in four parts [that is, in totality] were equal in number to a fifth part and a
sixth of the yellow herd.)

7. $y = \left(\tfrac{1}{6} + \tfrac{1}{7}\right)(W + w)$ (The yellow [cows] were in number equal to a sixth part and a seventh of the white herd.)

As we ask the reader to show in the exercises, this system has infinitely many solutions of the form

W = 10,366,482k
B = 7,460,514k
Y = 4,149,387k
D = 7,358,060k
w = 7,206,360k          (4)
b = 4,893,246k
y = 5,439,213k
d = 3,515,820k

where k is any real number. The values k = 1, 2, ... give infinitely many positive integer solutions to the problem, with k = 1
giving the smallest solution.
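The values in (4) can be checked against the seven equations by direct substitution. The Python sketch below does so for k = 1 using exact rational arithmetic; it only verifies the stated solution and does not derive it, and the list name `checks` is an assumption introduced here.

```python
from fractions import Fraction

# The solution (4) with k = 1: bulls in upper case, cows in lower case.
W, B, Y, D = 10_366_482, 7_460_514, 4_149_387, 7_358_060
w, b, y, d = 7_206_360, 4_893_246, 5_439_213, 3_515_820

# One entry per equation, in the order 1 through 7 used in the text.
checks = [
    W == (Fraction(1, 2) + Fraction(1, 3)) * B + Y,
    B == (Fraction(1, 4) + Fraction(1, 5)) * D + Y,
    D == (Fraction(1, 6) + Fraction(1, 7)) * W + Y,
    w == (Fraction(1, 3) + Fraction(1, 4)) * (B + b),
    b == (Fraction(1, 4) + Fraction(1, 5)) * (D + d),
    d == (Fraction(1, 5) + Fraction(1, 6)) * (Y + y),
    y == (Fraction(1, 6) + Fraction(1, 7)) * (W + w),
]

print(checks)
print("total cattle for k = 1:", W + B + Y + D + w + b + y + d)
```

Because the system is homogeneous, any multiple k of these eight values satisfies the same seven equations.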



EXAMPLE 5 India (fourth century A.D.)



Fragment III-5-3v of the Bakhshali Manuscript



The Bakhshali Manuscript is an ancient work of Indian/Hindu mathematics dating from around the fourth century A.D., although 
some of its materials undoubtedly come from many centuries before. It consists of about 70 leaves or sheets of birch bark 
containing mathematical problems and their solutions. Many of its problems are so-called equalization problems that lead to 
systems of linear equations. One such problem on the fragment shown is the following: 



One merchant has seven asava horses, a second has nine haya horses, and a third has ten camels. They are equally well off in
the value of their animals if each gives two animals, one to each of the others. Find the price of each animal and the total value 
of the animals possessed by each merchant. 



Let x be the price of an asava horse, let y be the price of a haya horse, let z be the price of a camel, and let K be the total value
of the animals possessed by each merchant. Then the conditions of the problem lead to the following system of equations:

$5x + y + z = K$
$x + 7y + z = K$ (5)
$x + y + 8z = K$

The method of solution described in the manuscript begins by subtracting the quantity $(x + y + z)$ from both sides of the three
equations to obtain $4x = 6y = 7z = K - (x + y + z)$. This shows that if the prices x, y, and z are to be integers, then the quantity
$K - (x + y + z)$ must be an integer that is divisible by 4, 6, and 7. The manuscript takes the product of these three numbers, or
168, for the value $K - (x + y + z)$, which yields $x = 42$, $y = 28$, and $z = 24$ for the prices and $K = 262$ for the total value. (See
Exercise 6 for more solutions to this problem.)
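The arithmetic of Example 5 is short enough to check mechanically. The Python sketch below follows the manuscript's choice of 168 for the common value of 4x, 6y, and 7z and then confirms that the three equations in (5) hold; the variable name `common` is an assumption introduced for illustration.

```python
# The manuscript's choice for the common value 4x = 6y = 7z = K - (x + y + z).
common = 4 * 6 * 7

# Prices of an asava horse, a haya horse, and a camel.
x, y, z = common // 4, common // 6, common // 7

# Total value held by each merchant after the exchange.
K = common + (x + y + z)

print(x, y, z, K)
print(5 * x + y + z == K, x + 7 * y + z == K, x + y + 8 * z == K)
```

Any other common multiple of 4, 6, and 7 could be used in place of 168 and would give a different integer solution.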



Exercise Set 11.4



1. The following lines from Book 12 of Homer's Odyssey relate a precursor of Archimedes' Cattle Problem:

Thou shalt ascend the isle triangular, 
Where many oxen of the Sun are fed, 
And fatted flocks. Of oxen fifty head 
In every herd feed, and their herds are seven; 

And of his fat flocks is their number even. 

The last line means that there are as many sheep in all the flocks as there are oxen in all the herds. What is the total number of 
oxen and sheep that belong to the god of the Sun? (This was a difficult problem in Homer's day.) 

2. Solve the following problems from the Bakhshali Manuscript.

(a) B possesses two times as much as A; C has three times as much as A and B together; D has four times as much as A, B, 
and C together. Their total possessions are 300. What is the possession of A? 

(b) B gives 2 times as much as A; C gives 3 times as much as B; D gives 4 times as much as C. Their total gift is 132. What 
is the gift of A? 

3. A problem on a Babylonian tablet requires finding the length and width of a rectangle given that the length and the width add
up to 10, while the length and one-fourth of the width add up to 7. The solution provided on the tablet consists of the following
four statements:

Multiply 7 by 4 to obtain 28. 

Take away 10 from 28 to obtain 18. 

Take one-third of 18 to obtain 6, the length. 

Take away 6 from 10 to obtain 4, the width. 
Explain how these steps lead to the answer. 

4. The following two problems are from "The Nine Chapters of the Mathematical Art." Solve them using the array technique
described in Example 3.



(a) Five oxen and two sheep are worth 10 units and two oxen and five sheep are worth 8 units. What is the value of each ox 
and sheep? 

(b) There are three kinds of corn. The grains contained in two, three, and four bundles, respectively, of these three classes 
of corn, are not sufficient to make a whole measure. However, if we added to them one bundle of the second, third, and 
first classes, respectively, then the grains would become one full measure in each case. How many measures of grain
does each bundle of the different classes contain? 



5. This problem in part (a) is known as the "Flower of Thymaridas," named after a Pythagorean of the fourth century B.C.

(a) Given the n numbers $a_1, a_2, \ldots, a_n$, solve for $x_1, x_2, \ldots, x_n$ in the following linear system:

$x_1 + x_2 + \cdots + x_n = a_1$
$x_1 + x_2 = a_2$
$x_1 + x_3 = a_3$
$\vdots$
$x_1 + x_n = a_n$

(b) Identify a problem in this exercise set that fits the pattern in part (a), and solve it using your general solution. 

6. For Example 5 from the Bakhshali Manuscript:

(a) Express Equations (5) as a homogeneous linear system of three equations in four unknowns (x, y, z, and K) and show that
the solution set has one arbitrary parameter. 

(b) Find the smallest solution for which all four variables are positive integers. 

(c) Show that the solution given in Example 5 is included among your solutions. 



7. Solve the problems posed in the following three epigrams, which appear in a collection entitled "The Greek Anthology," which
was compiled in part by a scholar named Metrodorus around A.D. 500. Some of its 46 mathematical problems are believed to
date as far back as 600 B.C. (Before solving parts (a) and (c), you will have to formulate the question.)

(a) I desire my two sons to receive the thousand staters of which I am possessed, but let the fifth part of the legitimate one's 
share exceed by ten the fourth part of what falls to the illegitimate one. 

(b) Make me a crown weighing sixty minae, mixing gold and brass, and with them tin and much-wrought iron. Let the gold
and brass together form two-thirds, the gold and tin together three-fourths, and the gold and iron three-fifths. Tell me 
how much gold you must put in, how much brass, how much tin, and how much iron, so as to make the whole crown 
weigh sixty minae. 

(c) First person: I have what the second has and the third of what the third has. Second person: I have what the third has
and the third of what the first has. Third person: And I have ten minae and the third of what the second has.

Technology Exercises



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 



T1.

(a) Solve Archimedes' Cattle Problem using a symbolic algebra program.

(b) The Cattle Problem has a second part in which two additional conditions are imposed. The first of these states that
"When the white bulls mingled their number with the black, they stood firm, equal in depth and breadth." This requires
that $W + B$ be a square number, i.e., 1, 4, 9, 16, 25, etc. Show that this requires that the values of k in Eq. (4) be
restricted as follows:

$k = 4{,}456{,}749r^2, \quad r = 1, 2, 3, \ldots$

and find the smallest total number of cattle that satisfies this second condition.



Remark The second condition imposed in the second part of the Cattle Problem states that "When the yellow and the
dappled bulls were gathered into one herd, they stood in such a manner that their number, beginning from one, grew slowly
greater 'til it completed a triangular figure." This requires that the quantity $Y + D$ be a triangular number, that is, a number of
the form 1, 1 + 2, 1 + 2 + 3, 1 + 2 + 3 + 4, .... This final part of the problem was not completely solved until 1965 when all
206,545 digits of the smallest number of cattle that satisfies this condition were found using a computer.

T2. The following problem is from "The Nine Chapters of the Mathematical Art" and determines a homogeneous linear system of
five equations in six unknowns. Show that the system has infinitely many solutions, and find the one for which the depth of
the well and the lengths of the five ropes are the smallest possible positive integers.



Suppose that five families share a well. Suppose further that 

2 of A's ropes are short of the well's depth by one of B's ropes. 

3 of B's ropes are short of the well's depth by one of C's ropes. 

4 of C's ropes are short of the well's depth by one of D's ropes. 

5 of D's ropes are short of the well's depth by one of E's ropes. 



6 of E's ropes are short of the well's depth by one of A's ropes.
11.5

CUBIC SPLINE 
INTERPOLATION 



In this section an artist's drafting aid is used as a physical model for the 
mathematical problem of finding a curve that passes through specified points in 
the plane. The parameters of the curve are determined by solving a linear 
system of equations. 



Prerequisites: Linear Systems 
Matrix Algebra 
Differential Calculus 



Curve Fitting 

Fitting a curve through specified points in the plane is a common problem encountered in analyzing experimental data, in 
ascertaining the relations among variables, and in design work. In Figure 11.5.3 seven points in the xy-plane are displayed, and in
Figure 11.5.2 a smooth curve has been drawn that passes through them. A curve that passes through a set of points in the plane is
said to interpolate those points, and the curve is called an interpolating curve for those points. The interpolating curve in Figure
11.5.1 was drawn with the aid of a drafting spline (Figure 11.5.2). This drafting aid consists of a thin, flexible strip of wood or
other material that is bent to pass through the points to be interpolated. Attached sliding weights hold the spline in position while 
the artist draws the interpolating curve. The drafting spline will serve as the physical model for a mathematical theory of 
interpolation that we will discuss in this section. 



Figure 11.5.1 




Figure 11.5.2 






Figure 11.5.3 



Statement of the Problem 



Suppose that we are given n points in the xy-plane,

\[ (x_1, y_1),\ (x_2, y_2),\ \ldots,\ (x_n, y_n) \]

which we wish to interpolate with a "well-behaved" curve (Figure 11.5.4). For convenience, we take the points to be equally
spaced in the x-direction, although our results can easily be extended to the case of unequally spaced points. If we let the common
distance between the x-coordinates of the points be h, then we have

\[ x_2 - x_1 = x_3 - x_2 = \cdots = x_n - x_{n-1} = h \]

Let $y = S(x)$, $x_1 \le x \le x_n$, denote the interpolating curve that we seek. We assume that this curve describes the displacement of a
drafting spline that interpolates the n points when the weights holding down the spline are situated precisely at the n points. It is
known from linear beam theory that for small displacements, the fourth derivative of the displacement of a beam is zero along any
interval of the x-axis that contains no external forces acting on the beam. If we treat our drafting spline as a thin beam and realize
that the only external forces acting on it arise from the weights at the n specified points, then it follows that

\[ S^{(4)}(x) = 0 \tag{1} \]

for values of x lying in the $n - 1$ open intervals

\[ (x_1, x_2),\ (x_2, x_3),\ \ldots,\ (x_{n-1}, x_n) \]

between the n points.




Figure 11.5.4 



We also need the result from linear beam theory that states that for a beam acted upon only by external forces, the displacement
must have two continuous derivatives. In the case of the interpolating curve $y = S(x)$ constructed by the drafting spline, this means
that $S(x)$, $S'(x)$, and $S''(x)$ must be continuous for $x_1 \le x \le x_n$.

The condition that $S''(x)$ be continuous is what causes a drafting spline to produce a pleasing curve, as it results in continuous
curvature. The eye can perceive sudden changes in curvature (that is, discontinuities in $S''(x)$), but sudden changes in higher
derivatives are not discernible. Thus, the condition that $S''(x)$ be continuous is the minimal prerequisite for the interpolating curve
to be perceptible as a single smooth curve, rather than as a series of separate curves pieced together.

To determine the mathematical form of the function $S(x)$, we observe that because $S^{(4)}(x) = 0$ in the intervals between the n
specified points, it follows by integrating this equation four times that $S(x)$ must be a cubic polynomial in x in each such interval.
In general, however, $S(x)$ will be a different cubic polynomial in each interval, so $S(x)$ must have the form

\[ S(x) = \begin{cases} S_1(x), & x_1 \le x \le x_2 \\ S_2(x), & x_2 \le x \le x_3 \\ \quad\vdots \\ S_{n-1}(x), & x_{n-1} \le x \le x_n \end{cases} \tag{2} \]

where $S_1(x), S_2(x), \ldots, S_{n-1}(x)$ are cubic polynomials. For convenience, we will write these in the form

\[ \begin{aligned}
S_1(x) &= a_1(x - x_1)^3 + b_1(x - x_1)^2 + c_1(x - x_1) + d_1, & x_1 \le x \le x_2 \\
S_2(x) &= a_2(x - x_2)^3 + b_2(x - x_2)^2 + c_2(x - x_2) + d_2, & x_2 \le x \le x_3 \\
&\ \ \vdots \\
S_{n-1}(x) &= a_{n-1}(x - x_{n-1})^3 + b_{n-1}(x - x_{n-1})^2 + c_{n-1}(x - x_{n-1}) + d_{n-1}, & x_{n-1} \le x \le x_n
\end{aligned} \tag{3} \]

The $a_i$'s, $b_i$'s, $c_i$'s, and $d_i$'s constitute a total of $4n - 4$ coefficients that we must determine to specify $S(x)$ completely. If we
choose these coefficients so that $S(x)$ interpolates the n specified points in the plane and $S(x)$, $S'(x)$, and $S''(x)$ are continuous,
then the resulting interpolating curve is called a cubic spline.

Derivation of the Formula of a Cubic Spline 

From Equations 2 and 3, we have

\[ \begin{aligned}
S(x) = S_1(x) &= a_1(x - x_1)^3 + b_1(x - x_1)^2 + c_1(x - x_1) + d_1, & x_1 \le x \le x_2 \\
S(x) = S_2(x) &= a_2(x - x_2)^3 + b_2(x - x_2)^2 + c_2(x - x_2) + d_2, & x_2 \le x \le x_3 \\
&\ \ \vdots \\
S(x) = S_{n-1}(x) &= a_{n-1}(x - x_{n-1})^3 + b_{n-1}(x - x_{n-1})^2 + c_{n-1}(x - x_{n-1}) + d_{n-1}, & x_{n-1} \le x \le x_n
\end{aligned} \tag{4} \]

so

\[ \begin{aligned}
S'(x) = S_1'(x) &= 3a_1(x - x_1)^2 + 2b_1(x - x_1) + c_1, & x_1 \le x \le x_2 \\
S'(x) = S_2'(x) &= 3a_2(x - x_2)^2 + 2b_2(x - x_2) + c_2, & x_2 \le x \le x_3 \\
&\ \ \vdots \\
S'(x) = S_{n-1}'(x) &= 3a_{n-1}(x - x_{n-1})^2 + 2b_{n-1}(x - x_{n-1}) + c_{n-1}, & x_{n-1} \le x \le x_n
\end{aligned} \tag{5} \]

and

\[ \begin{aligned}
S''(x) = S_1''(x) &= 6a_1(x - x_1) + 2b_1, & x_1 \le x \le x_2 \\
S''(x) = S_2''(x) &= 6a_2(x - x_2) + 2b_2, & x_2 \le x \le x_3 \\
&\ \ \vdots \\
S''(x) = S_{n-1}''(x) &= 6a_{n-1}(x - x_{n-1}) + 2b_{n-1}, & x_{n-1} \le x \le x_n
\end{aligned} \tag{6} \]

We will now use these equations and the four properties of cubic splines stated below to express the unknown coefficients $a_i$, $b_i$,
$c_i$, $d_i$, $i = 1, 2, \ldots, n - 1$, in terms of the known coordinates $y_1, y_2, \ldots, y_n$.

1. $S(x)$ interpolates the points $(x_i, y_i)$, $i = 1, 2, \ldots, n$. Because $S(x)$ interpolates the points $(x_i, y_i)$, $i = 1, 2, \ldots, n$, we have

\[ S(x_1) = y_1,\quad S(x_2) = y_2,\quad \ldots,\quad S(x_n) = y_n \tag{7} \]

From the first $n - 1$ of these equations and 4, we obtain

\[ d_1 = y_1,\quad d_2 = y_2,\quad \ldots,\quad d_{n-1} = y_{n-1} \tag{8} \]

From the last equation in 7, the last equation in 4, and the fact that $x_n - x_{n-1} = h$, we obtain

\[ a_{n-1}h^3 + b_{n-1}h^2 + c_{n-1}h + d_{n-1} = y_n \tag{9} \]



2. $S(x)$ is continuous on $[x_1, x_n]$. Because $S(x)$ is continuous for $x_1 \le x \le x_n$, it follows that at each point $x_i$ in the set $x_2$,
$x_3, \ldots, x_{n-1}$ we must have

\[ S_{i-1}(x_i) = S_i(x_i), \qquad i = 2, 3, \ldots, n - 1 \tag{10} \]

Otherwise, the graphs of $S_{i-1}(x)$ and $S_i(x)$ would not join together to form a continuous curve at $x_i$.
When we apply the interpolating property $S_i(x_i) = y_i$, it follows from 10 that $S_{i-1}(x_i) = y_i$, $i = 2, 3, \ldots, n - 1$,
or from 4 that

\[ \begin{aligned}
a_1h^3 + b_1h^2 + c_1h + d_1 &= y_2 \\
a_2h^3 + b_2h^2 + c_2h + d_2 &= y_3 \\
&\ \ \vdots \\
a_{n-2}h^3 + b_{n-2}h^2 + c_{n-2}h + d_{n-2} &= y_{n-1}
\end{aligned} \tag{11} \]



3. $S'(x)$ is continuous on $[x_1, x_n]$. Because $S'(x)$ is continuous for $x_1 \le x \le x_n$, it follows that

\[ S_{i-1}'(x_i) = S_i'(x_i), \qquad i = 2, 3, \ldots, n - 1 \]

or, from 5,

\[ \begin{aligned}
3a_1h^2 + 2b_1h + c_1 &= c_2 \\
3a_2h^2 + 2b_2h + c_2 &= c_3 \\
&\ \ \vdots \\
3a_{n-2}h^2 + 2b_{n-2}h + c_{n-2} &= c_{n-1}
\end{aligned} \tag{12} \]



4. $S''(x)$ is continuous on $[x_1, x_n]$. Because $S''(x)$ is continuous for $x_1 \le x \le x_n$, it follows that

\[ S_{i-1}''(x_i) = S_i''(x_i), \qquad i = 2, 3, \ldots, n - 1 \]

or, from 6,

\[ \begin{aligned}
6a_1h + 2b_1 &= 2b_2 \\
6a_2h + 2b_2 &= 2b_3 \\
&\ \ \vdots \\
6a_{n-2}h + 2b_{n-2} &= 2b_{n-1}
\end{aligned} \tag{13} \]



Equations 8, 9, 11, 12, and 13 constitute a system of $4n - 6$ linear equations in the $4n - 4$ unknown coefficients $a_i$, $b_i$, $c_i$, $d_i$, $i = 1$,
$2, \ldots, n - 1$. Consequently, we need two more equations to determine these coefficients uniquely. Before obtaining these additional
equations, however, we can simplify our existing system by expressing the unknowns $a_i$, $b_i$, $c_i$, and $d_i$ in terms of new unknown
quantities

\[ M_1 = S''(x_1),\quad M_2 = S''(x_2),\quad \ldots,\quad M_n = S''(x_n) \]

and the known quantities

\[ y_1,\ y_2,\ \ldots,\ y_n \]

For example, from 6 it follows that

\[ M_1 = 2b_1,\quad M_2 = 2b_2,\quad \ldots,\quad M_{n-1} = 2b_{n-1} \]

so

\[ b_1 = \tfrac{1}{2}M_1,\quad b_2 = \tfrac{1}{2}M_2,\quad \ldots,\quad b_{n-1} = \tfrac{1}{2}M_{n-1} \]

Moreover, we already know from 8 that

\[ d_1 = y_1,\quad d_2 = y_2,\quad \ldots,\quad d_{n-1} = y_{n-1} \]

We leave it as an exercise for the reader to derive the expressions for the $a_i$'s and $c_i$'s in terms of the $M_i$'s and $y_i$'s. The final
result is as follows:

THEOREM 11.5.1

Cubic Spline Interpolation

Given n points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ with $x_{i+1} - x_i = h$, $i = 1, 2, \ldots, n - 1$, the cubic spline

\[ S(x) = \begin{cases}
a_1(x - x_1)^3 + b_1(x - x_1)^2 + c_1(x - x_1) + d_1, & x_1 \le x \le x_2 \\
a_2(x - x_2)^3 + b_2(x - x_2)^2 + c_2(x - x_2) + d_2, & x_2 \le x \le x_3 \\
\quad\vdots \\
a_{n-1}(x - x_{n-1})^3 + b_{n-1}(x - x_{n-1})^2 + c_{n-1}(x - x_{n-1}) + d_{n-1}, & x_{n-1} \le x \le x_n
\end{cases} \]

that interpolates these points has coefficients given by

\[ \begin{aligned}
a_i &= (M_{i+1} - M_i)/6h \\
b_i &= M_i/2 \\
c_i &= (y_{i+1} - y_i)/h - [(M_{i+1} + 2M_i)h/6] \\
d_i &= y_i
\end{aligned} \tag{14} \]

for $i = 1, 2, \ldots, n - 1$, where $M_i = S''(x_i)$, $i = 1, 2, \ldots, n$.
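The formulas in 14 translate directly into a short routine. The following Python sketch is illustrative only: the function names `spline_coefficients` and `evaluate_spline` are hypothetical, and the second derivatives $M_i$ are assumed to be already known.

```python
def spline_coefficients(y, M, h):
    """Return one tuple (a_i, b_i, c_i, d_i) per interval, following Equations 14."""
    coeffs = []
    for i in range(len(y) - 1):
        a = (M[i + 1] - M[i]) / (6 * h)
        b = M[i] / 2
        c = (y[i + 1] - y[i]) / h - (M[i + 1] + 2 * M[i]) * h / 6
        d = y[i]
        coeffs.append((a, b, c, d))
    return coeffs


def evaluate_spline(x_nodes, coeffs, x):
    """Evaluate S(x) on the interval [x_i, x_{i+1}] that contains x."""
    i = len(coeffs) - 1              # default: last interval
    for j in range(len(coeffs)):
        if x <= x_nodes[j + 1]:
            i = j
            break
    a, b, c, d = coeffs[i]
    t = x - x_nodes[i]
    return ((a * t + b) * t + c) * t + d   # Horner form of the cubic


# Check on data taken from y = x^3, for which S''(x) = 6x is known exactly.
x_nodes = [0, 1, 2, 3]
y = [0, 1, 8, 27]
M = [0, 6, 12, 18]
coeffs = spline_coefficients(y, M, h=1)
value = evaluate_spline(x_nodes, coeffs, 1.5)
print(coeffs[1], value)
```

With exact values of $M_i$ for a cubic, each piece reproduces the same cubic, so the evaluation at 1.5 returns $1.5^3$.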



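From this result, we see that the quantities $M_1, M_2, \ldots, M_n$ uniquely determine the cubic spline. To find these quantities, we
substitute the expressions for $a_i$, $b_i$, and $c_i$ given in 14 into 12. After some algebraic simplification, we obtain

\[ \begin{aligned}
M_1 + 4M_2 + M_3 &= 6(y_1 - 2y_2 + y_3)/h^2 \\
M_2 + 4M_3 + M_4 &= 6(y_2 - 2y_3 + y_4)/h^2 \\
&\ \ \vdots \\
M_{n-2} + 4M_{n-1} + M_n &= 6(y_{n-2} - 2y_{n-1} + y_n)/h^2
\end{aligned} \tag{15} \]

or, in matrix form,

\[ \begin{bmatrix}
1 & 4 & 1 & 0 & \cdots & 0 & 0 & 0 & 0 \\
0 & 1 & 4 & 1 & \cdots & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 4 & \cdots & 0 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 4 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & \cdots & 1 & 4 & 1 & 0 \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 & 4 & 1
\end{bmatrix}
\begin{bmatrix} M_1 \\ M_2 \\ M_3 \\ M_4 \\ \vdots \\ M_{n-3} \\ M_{n-2} \\ M_{n-1} \\ M_n \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix} y_1 - 2y_2 + y_3 \\ y_2 - 2y_3 + y_4 \\ y_3 - 2y_4 + y_5 \\ \vdots \\ y_{n-4} - 2y_{n-3} + y_{n-2} \\ y_{n-3} - 2y_{n-2} + y_{n-1} \\ y_{n-2} - 2y_{n-1} + y_n \end{bmatrix} \]

This is a linear system of $n - 2$ equations for the n unknowns $M_1, M_2, \ldots, M_n$. Thus, we still need two additional equations to
determine $M_1, M_2, \ldots, M_n$ uniquely. The reason for this is that there are infinitely many cubic splines that interpolate the given
points, so we simply do not have enough conditions to determine a unique cubic spline passing through the points. We discuss
below three possible ways of specifying the two additional conditions required to obtain a unique cubic spline through the points.
(The exercises present two more.) They are summarized in Table 1.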



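Table 1

Natural Spline
The second derivative of the spline is zero at the endpoints.
End conditions: $M_1 = 0$, $M_n = 0$
Linear system:

\[ \begin{bmatrix}
4 & 1 & 0 & \cdots & 0 & 0 \\
1 & 4 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 4 & 1 \\
0 & 0 & 0 & \cdots & 1 & 4
\end{bmatrix}
\begin{bmatrix} M_2 \\ M_3 \\ \vdots \\ M_{n-2} \\ M_{n-1} \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix} y_1 - 2y_2 + y_3 \\ y_2 - 2y_3 + y_4 \\ \vdots \\ y_{n-3} - 2y_{n-2} + y_{n-1} \\ y_{n-2} - 2y_{n-1} + y_n \end{bmatrix} \]

Parabolic Runout Spline
The spline reduces to a parabolic curve on the first and last intervals.
End conditions: $M_1 = M_2$, $M_n = M_{n-1}$
Linear system:

\[ \begin{bmatrix}
5 & 1 & 0 & \cdots & 0 & 0 \\
1 & 4 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 4 & 1 \\
0 & 0 & 0 & \cdots & 1 & 5
\end{bmatrix}
\begin{bmatrix} M_2 \\ M_3 \\ \vdots \\ M_{n-2} \\ M_{n-1} \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix} y_1 - 2y_2 + y_3 \\ y_2 - 2y_3 + y_4 \\ \vdots \\ y_{n-3} - 2y_{n-2} + y_{n-1} \\ y_{n-2} - 2y_{n-1} + y_n \end{bmatrix} \]

Cubic Runout Spline
The spline is a single cubic curve on the first two and last two intervals.
End conditions: $M_1 = 2M_2 - M_3$, $M_n = 2M_{n-1} - M_{n-2}$
Linear system:

\[ \begin{bmatrix}
6 & 0 & 0 & \cdots & 0 & 0 \\
1 & 4 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & \cdots & 0 & 0 & 6
\end{bmatrix}
\begin{bmatrix} M_2 \\ M_3 \\ \vdots \\ M_{n-2} \\ M_{n-1} \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix} y_1 - 2y_2 + y_3 \\ y_2 - 2y_3 + y_4 \\ \vdots \\ y_{n-3} - 2y_{n-2} + y_{n-1} \\ y_{n-2} - 2y_{n-1} + y_n \end{bmatrix} \]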



The Natural Spline 

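The two simplest mathematical conditions we can impose are

\[ M_1 = M_n = 0 \]

These conditions together with 15 result in an $n \times n$ linear system for $M_1, M_2, \ldots, M_n$, which can be written in matrix form as

\[ \begin{bmatrix}
1 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 \\
1 & 4 & 1 & 0 & \cdots & 0 & 0 & 0 \\
0 & 1 & 4 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & 0 & 0 & \cdots & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} M_1 \\ M_2 \\ M_3 \\ \vdots \\ M_{n-1} \\ M_n \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix} 0 \\ y_1 - 2y_2 + y_3 \\ y_2 - 2y_3 + y_4 \\ \vdots \\ y_{n-2} - 2y_{n-1} + y_n \\ 0 \end{bmatrix} \]

For numerical calculations it is more convenient to eliminate $M_1$ and $M_n$ from this system and write

\[ \begin{bmatrix}
4 & 1 & 0 & 0 & \cdots & 0 & 0 & 0 \\
1 & 4 & 1 & 0 & \cdots & 0 & 0 & 0 \\
0 & 1 & 4 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 & 4
\end{bmatrix}
\begin{bmatrix} M_2 \\ M_3 \\ M_4 \\ \vdots \\ M_{n-2} \\ M_{n-1} \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix} y_1 - 2y_2 + y_3 \\ y_2 - 2y_3 + y_4 \\ y_3 - 2y_4 + y_5 \\ \vdots \\ y_{n-3} - 2y_{n-2} + y_{n-1} \\ y_{n-2} - 2y_{n-1} + y_n \end{bmatrix} \tag{16} \]

together with

\[ M_1 = 0 \tag{17} \]

\[ M_n = 0 \tag{18} \]

Thus, the $(n - 2) \times (n - 2)$ linear system can be solved for the $n - 2$ coefficients $M_2, M_3, \ldots, M_{n-1}$, and $M_1$ and $M_n$ are
determined by 17 and 18.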



Physically, the natural spline results when the ends of a drafting spline extend freely beyond the interpolating points without
constraint. The end portions of the spline outside the interpolating points will fall on straight line paths, causing $S''(x)$ to vanish at
the endpoints $x_1$ and $x_n$ and resulting in the mathematical conditions $M_1 = M_n = 0$.

The natural spline tends to flatten the interpolating curve at the endpoints, which may be undesirable. Of course, if it is required
that $S''(x)$ vanish at the endpoints, then the natural spline must be used.
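System 16 is tridiagonal, so it can be solved in a single forward and backward sweep. The sketch below is one possible way to do this in Python; the helper names `solve_tridiagonal` and `natural_spline_M` are hypothetical, and the elimination is done without pivoting, which is assumed to be safe here because the matrix in 16 is strictly diagonally dominant.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system by elimination without pivoting.

    lower[i] and upper[i] are the sub- and super-diagonal entries of row i
    (lower[0] and upper[-1] are not used).
    """
    n = len(diag)
    d = [float(v) for v in diag]
    r = [float(v) for v in rhs]
    for i in range(1, n):                 # forward elimination
        w = lower[i] / d[i - 1]
        d[i] -= w * upper[i - 1]
        r[i] -= w * r[i - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):        # back substitution
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x


def natural_spline_M(y, h):
    """Return [M_1, ..., M_n] for the natural spline (Equations 16, 17, 18)."""
    m = len(y) - 2                        # number of interior unknowns
    if m < 1:
        raise ValueError("need at least three data points")
    rhs = [6 * (y[i] - 2 * y[i + 1] + y[i + 2]) / h**2 for i in range(m)]
    ones = [1.0] * m
    inner = solve_tridiagonal(ones, [4.0] * m, ones, rhs)
    return [0.0] + inner + [0.0]          # M_1 = M_n = 0


M = natural_spline_M([0, 1, 0, 1], h=1)
print(M)
```

For the four sample values used here, the interior system is $4M_2 + M_3 = -12$, $M_2 + 4M_3 = 12$, which has the solution $M_2 = -4$, $M_3 = 4$.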

The Parabolic Runout Spline 

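The two additional constraints imposed for this type of spline are

\[ M_1 = M_2 \tag{19} \]

\[ M_n = M_{n-1} \tag{20} \]

If we use the preceding two equations to eliminate $M_1$ and $M_n$ from 15, we obtain the $(n - 2) \times (n - 2)$ linear system

\[ \begin{bmatrix}
5 & 1 & 0 & 0 & \cdots & 0 & 0 & 0 \\
1 & 4 & 1 & 0 & \cdots & 0 & 0 & 0 \\
0 & 1 & 4 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 & 5
\end{bmatrix}
\begin{bmatrix} M_2 \\ M_3 \\ M_4 \\ \vdots \\ M_{n-2} \\ M_{n-1} \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix} y_1 - 2y_2 + y_3 \\ y_2 - 2y_3 + y_4 \\ y_3 - 2y_4 + y_5 \\ \vdots \\ y_{n-3} - 2y_{n-2} + y_{n-1} \\ y_{n-2} - 2y_{n-1} + y_n \end{bmatrix} \tag{21} \]

for $M_2, M_3, \ldots, M_{n-1}$. Once these $n - 2$ values have been determined, $M_1$ and $M_n$ are determined from 19 and 20.

From 14 we see that $M_1 = M_2$ implies that $a_1 = 0$, and $M_n = M_{n-1}$ implies that $a_{n-1} = 0$. Thus, from 3 there are no cubic terms
in the formula for the spline over the end intervals $[x_1, x_2]$ and $[x_{n-1}, x_n]$. Hence, as the name suggests, the parabolic runout
spline reduces to a parabolic curve over these end intervals.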

The Cubic Runout Spline 

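For this type of spline, we impose the two additional conditions

\[ M_1 = 2M_2 - M_3 \tag{22} \]

\[ M_n = 2M_{n-1} - M_{n-2} \tag{23} \]

Using these two equations to eliminate $M_1$ and $M_n$ from 15 results in the following $(n - 2) \times (n - 2)$ linear system for $M_2, M_3,
\ldots, M_{n-1}$:

\[ \begin{bmatrix}
6 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 \\
1 & 4 & 1 & 0 & \cdots & 0 & 0 & 0 \\
0 & 1 & 4 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & 0 & 0 & \cdots & 0 & 0 & 6
\end{bmatrix}
\begin{bmatrix} M_2 \\ M_3 \\ M_4 \\ \vdots \\ M_{n-2} \\ M_{n-1} \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix} y_1 - 2y_2 + y_3 \\ y_2 - 2y_3 + y_4 \\ y_3 - 2y_4 + y_5 \\ \vdots \\ y_{n-3} - 2y_{n-2} + y_{n-1} \\ y_{n-2} - 2y_{n-1} + y_n \end{bmatrix} \tag{24} \]

After we solve this linear system for $M_2, M_3, \ldots, M_{n-1}$, we can use 22 and 23 to determine $M_1$ and $M_n$.

If we rewrite 22 as

\[ M_2 - M_1 = M_3 - M_2 \]

it follows from 14 that $a_1 = a_2$. Because $S'''(x) = 6a_1$ on $[x_1, x_2]$ and $S'''(x) = 6a_2$ on $[x_2, x_3]$, we see that $S'''(x)$ is constant
over the entire interval $[x_1, x_3]$. Consequently, $S(x)$ consists of a single cubic curve over the interval $[x_1, x_3]$ rather than two
different cubic curves pieced together at $x_2$. [To see this, integrate $S'''(x)$ three times.] A similar analysis shows that $S(x)$ consists
of a single cubic curve over the last two intervals.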



Whereas the natural spline tends to produce an interpolating curve that is flat at the endpoints, the cubic runout spline has the 
opposite tendency: it produces a curve with pronounced curvature at the endpoints. If neither behavior is desired, the parabolic 
runout spline is a reasonable compromise. 
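The single-cubic property of the cubic runout spline can be checked numerically. The following Python sketch builds the cubic runout system (first and last rows with a single entry 6, interior rows 1, 4, 1) and solves it in exact rational arithmetic; the function names and the sample cubic $y = x^3 - 2x + 1$ are illustrative assumptions, not part of the text.

```python
from fractions import Fraction


def solve_linear(A, b):
    """Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(A)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def cubic_runout_M(y, h):
    """Return [M_1, ..., M_n] for the cubic runout spline."""
    m = len(y) - 2
    if m < 2:
        raise ValueError("need at least four data points")
    A = [[0] * m for _ in range(m)]
    A[0][0] = 6                            # first row: 6 0 0 ... 0
    A[m - 1][m - 1] = 6                    # last row:  0 ... 0 0 6
    for i in range(1, m - 1):
        A[i][i - 1], A[i][i], A[i][i + 1] = 1, 4, 1
    rhs = [Fraction(6) * (y[i] - 2 * y[i + 1] + y[i + 2]) / Fraction(h) ** 2
           for i in range(m)]
    inner = solve_linear(A, rhs)
    first = 2 * inner[0] - inner[1]        # M_1 = 2 M_2 - M_3
    last = 2 * inner[-1] - inner[-2]       # M_n = 2 M_{n-1} - M_{n-2}
    return [first] + inner + [last]


# Data sampled from the single cubic y = x^3 - 2x + 1 at x = 0, 1, 2, 3, 4.
y = [1, 0, 5, 22, 57]
M = cubic_runout_M(y, 1)
leading = [(M[i + 1] - M[i]) / 6 for i in range(len(M) - 1)]   # the a_i with h = 1
print(M, leading)
```

Because the data lie on one cubic with $S''(x) = 6x$, the computed $M_i$ equal $6x_i$ and every interval has the same leading coefficient $a_i = 1$.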



EXAMPLE 1 Using a Parabolic Runout Spline 



The density of water is well known to reach a maximum at a temperature slightly above freezing. Table 2, from the Handbook of 
Chemistry and Physics (Cleveland, Ohio: Chemical Rubber Publishing Company), gives the density of water in grams per cubic 
centimeter for five equally spaced temperatures from -10° C to 30° C. We will interpolate these five temperature-density 
measurements with a parabolic runout spline and attempt to find the maximum density of water in this range by finding the 
maximum value on this cubic spline. In the exercises we ask the reader to perform similar calculations using a natural spline and a 
cubic runout spline to interpolate the data points. 



Table 2

Temperature (°C)    Density (g/cm^3)
     -10                .99815
       0                .99987
      10                .99973
      20                .99823
      30                .99567

Set

\[ \begin{aligned}
x_1 &= -10, & y_1 &= .99815 \\
x_2 &= 0, & y_2 &= .99987 \\
x_3 &= 10, & y_3 &= .99973 \\
x_4 &= 20, & y_4 &= .99823 \\
x_5 &= 30, & y_5 &= .99567
\end{aligned} \]

Then, with $h = 10$,

\[ \begin{aligned}
6[y_1 - 2y_2 + y_3]/h^2 &= -.0001116 \\
6[y_2 - 2y_3 + y_4]/h^2 &= -.0000816 \\
6[y_3 - 2y_4 + y_5]/h^2 &= -.0000636
\end{aligned} \]

and the linear system 21 for the parabolic runout spline becomes

\[ \begin{bmatrix} 5 & 1 & 0 \\ 1 & 4 & 1 \\ 0 & 1 & 5 \end{bmatrix}
\begin{bmatrix} M_2 \\ M_3 \\ M_4 \end{bmatrix}
= \begin{bmatrix} -.0001116 \\ -.0000816 \\ -.0000636 \end{bmatrix} \]

Solving this system yields

\[ \begin{aligned}
M_2 &= -.00001973 \\
M_3 &= -.00001293 \\
M_4 &= -.00001013
\end{aligned} \]

From 19 and 20, we have

\[ \begin{aligned}
M_1 &= M_2 = -.00001973 \\
M_5 &= M_4 = -.00001013
\end{aligned} \]

Solving for the $a_i$'s, $b_i$'s, $c_i$'s, and $d_i$'s in 14, we obtain the following expression for the interpolating parabolic runout spline:

\[ S(x) = \begin{cases}
-.00000987(x + 10)^2 + .0002707(x + 10) + .99815, & -10 \le x \le 0 \\
.000000113(x - 0)^3 - .00000987(x - 0)^2 + .0000733(x - 0) + .99987, & 0 \le x \le 10 \\
.000000047(x - 10)^3 - .00000647(x - 10)^2 - .0000900(x - 10) + .99973, & 10 \le x \le 20 \\
-.00000507(x - 20)^2 - .0002053(x - 20) + .99823, & 20 \le x \le 30
\end{cases} \]

This spline is plotted in Figure 11.5.5. From that figure we see that the maximum is attained in the interval [0, 10]. To find this
maximum, we set $S'(x)$ equal to zero in the interval [0, 10]:

\[ S'(x) = .000000339x^2 - .0000197x + .0000733 = 0 \]

To three significant digits the root of this quadratic in the interval [0, 10] is $x = 3.99$, and for this value of x, $S(3.99) = 1.00001$.

Thus, according to our interpolated estimate, the maximum density of water is 1.00001 g/cm^3 attained at 3.99° C. This agrees well
with the experimental maximum density of 1.00000 g/cm^3 attained at 3.98° C. (In the original metric system, the gram was defined
as the mass of one cubic centimeter of water at its maximum density.)
Figure 11.5.5
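The calculation in Example 1 can be retraced in a few lines of Python. This is only a sketch: the function name `parabolic_runout_M` is hypothetical, and the elimination is written out for the specific tridiagonal pattern of system 21 (diagonal 5, 4, ..., 4, 5 with off-diagonal entries 1).

```python
import math


def parabolic_runout_M(y, h):
    """Return [M_1, ..., M_n] for the parabolic runout spline (Equations 19-21)."""
    m = len(y) - 2
    if m < 2:
        raise ValueError("need at least four data points")
    rhs = [6 * (y[i] - 2 * y[i + 1] + y[i + 2]) / h**2 for i in range(m)]
    diag = [5.0] + [4.0] * (m - 2) + [5.0]
    for i in range(1, m):                  # forward elimination (off-diagonals are 1)
        w = 1.0 / diag[i - 1]
        diag[i] -= w
        rhs[i] -= w * rhs[i - 1]
    inner = [0.0] * m
    inner[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):         # back substitution
        inner[i] = (rhs[i] - inner[i + 1]) / diag[i]
    return [inner[0]] + inner + [inner[-1]]    # M_1 = M_2 and M_n = M_{n-1}


temps = [-10, 0, 10, 20, 30]
dens = [0.99815, 0.99987, 0.99973, 0.99823, 0.99567]
h = 10
M = parabolic_runout_M(dens, h)

# Coefficients from Equations 14 on the interval [0, 10] (list index 1).
i = 1
a = (M[i + 1] - M[i]) / (6 * h)
b = M[i] / 2
c = (dens[i + 1] - dens[i]) / h - (M[i + 1] + 2 * M[i]) * h / 6
d = dens[i]

# S'(x) = 3a t^2 + 2b t + c with t = x - 0; the smaller root lies in [0, 10].
disc = (2 * b) ** 2 - 4 * (3 * a) * c
t_max = (-2 * b - math.sqrt(disc)) / (2 * 3 * a)
x_max = temps[i] + t_max
s_max = ((a * t_max + b) * t_max + c) * t_max + d
print(M)
print(round(x_max, 2), round(s_max, 5))
```

Carrying full floating-point precision instead of rounding the coefficients first gives the same location and maximum value to the number of digits quoted in the example.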



Closing Remarks 

In addition to producing excellent interpolating curves, cubic splines and their generalizations are useful for numerical integration 
and differentiation, for the numerical solution of differential and integral equations, and in optimization theory. 



Exercise Set 11.5






1. Derive the expressions for $a_i$ and $c_i$ in Equations 14 of Theorem 11.5.1.

2. The six points

(0, .00000), (.2, .19867), (.4, .38942), (.6, .56464), (.8, .71736), (1.0, .84147)

lie on the graph of $y = \sin x$, where x is in radians.

(a) Find the portion of the parabolic runout spline that interpolates these six points for $.4 \le x \le .6$. Maintain an accuracy of
five decimal places in your calculations.

(b) Calculate $S(.5)$ for the spline you found in part (a). What is the percentage error of $S(.5)$ with respect to the "exact"
value of $\sin(.5) = .47943$?



3. The following five points

(0, 1), (1, 7), (2, 27), (3, 79), (4, 181)

lie on a single cubic curve.

(a) Which of the three types of cubic splines (natural, parabolic runout, or cubic runout) would agree exactly with the single
cubic curve on which the five points lie?

(b) Determine the cubic spline you chose in part (a), and verify that it is a single cubic curve that interpolates the five
points.



4. Repeat the calculations in Example 1 using a natural spline to interpolate the five data points.

5. Repeat the calculations in Example 1 using a cubic runout spline to interpolate the five data points.

6. Consider the five points (0, 0), (.5, 1), (1, 0), (1.5, -1), and (2, 0) on the graph of $y = \sin(\pi x)$.

(a) Use a natural spline to interpolate the data points (0, 0), (.5, 1), and (1, 0).

(b) Use a natural spline to interpolate the data points (.5, 1), (1, 0), and (1.5, -1).

(c) Explain the unusual nature of your result in part (b).



7. (The Periodic Spline) If it is known or if it is desired that the n points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ to be interpolated lie
on a single cycle of a periodic curve with period $x_n - x_1$, then an interpolating cubic spline $S(x)$ must satisfy

\[ S(x_1) = S(x_n), \qquad S'(x_1) = S'(x_n), \qquad S''(x_1) = S''(x_n) \]

(a) Show that these three periodicity conditions require that

\[ y_1 = y_n, \qquad M_1 = M_n, \qquad 4M_1 + M_2 + M_{n-1} = 6(y_{n-1} - 2y_1 + y_2)/h^2 \]

(b) Using the three equations in part (a) and Equations 15, construct an $(n - 1) \times (n - 1)$ linear system for $M_1, M_2, \ldots,
M_{n-1}$ in matrix form.



8. (The Clamped Spline) Suppose that, in addition to the n points to be interpolated, we are given specific values $y_1'$ and $y_n'$ for
the slopes $S'(x_1)$ and $S'(x_n)$ of the interpolating cubic spline at the endpoints $x_1$ and $x_n$.

(a) Show that

\[ \begin{aligned}
2M_1 + M_2 &= 6(y_2 - y_1 - hy_1')/h^2 \\
2M_n + M_{n-1} &= 6(y_{n-1} - y_n + hy_n')/h^2
\end{aligned} \]

(b) Using the equations in part (a) and Equations 15, construct an $n \times n$ linear system for $M_1, M_2, \ldots, M_n$ in matrix form.

Remark The clamped spline described in this exercise is the most accurate type of spline for interpolation work if the slopes at
the endpoints are known or can be estimated.



Section 11.5

Technology Exercises

The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets.



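T1. In the solution of the natural cubic spline problem, it is necessary to solve a system of equations having coefficient matrix

\[ A_n = \begin{bmatrix}
4 & 1 & 0 & 0 & \cdots & 0 & 0 \\
1 & 4 & 1 & 0 & \cdots & 0 & 0 \\
0 & 1 & 4 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & 0 & \cdots & 0 & 1 & 4
\end{bmatrix} \]

If we can present a formula for the inverse of this matrix, then the solution for the natural cubic spline problem can be easily
obtained. In this exercise and the next, we use a computer to discover this formula. Toward this end, we first determine an
expression for the determinant of $A_n$, denoted by the symbol $D_n$. Given that

\[ A_1 = [4] \quad \text{and} \quad A_2 = \begin{bmatrix} 4 & 1 \\ 1 & 4 \end{bmatrix} \]

we see that

\[ D_1 = \det(A_1) = \det[4] = 4 \]

and

\[ D_2 = \det(A_2) = \det \begin{bmatrix} 4 & 1 \\ 1 & 4 \end{bmatrix} = 15 \]

(a) Use the cofactor expansion of determinants to show that

\[ D_n = 4D_{n-1} - D_{n-2} \]

for $n = 3, 4, 5, \ldots$. This says, for example, that

\[ \begin{aligned}
D_3 &= 4D_2 - D_1 = 4(15) - 4 = 56 \\
D_4 &= 4D_3 - D_2 = 4(56) - 15 = 209
\end{aligned} \]

and so on. Using a computer, check this result for $5 \le n \le 10$.

(b) By writing

\[ D_n = 4D_{n-1} - D_{n-2} \]

and the identity, $D_{n-1} = D_{n-1}$, in matrix form,

\[ \begin{bmatrix} D_n \\ D_{n-1} \end{bmatrix} = \begin{bmatrix} 4 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} D_{n-1} \\ D_{n-2} \end{bmatrix} \]

show that

\[ \begin{bmatrix} D_n \\ D_{n-1} \end{bmatrix} = \begin{bmatrix} 4 & -1 \\ 1 & 0 \end{bmatrix}^{n-2} \begin{bmatrix} D_2 \\ D_1 \end{bmatrix} = \begin{bmatrix} 4 & -1 \\ 1 & 0 \end{bmatrix}^{n-2} \begin{bmatrix} 15 \\ 4 \end{bmatrix} \]

(c) Use the methods in Section 7.2 and a computer to show that

\[ \begin{bmatrix} 4 & -1 \\ 1 & 0 \end{bmatrix}^{n-2} = \frac{1}{2\sqrt{3}} \begin{bmatrix}
(2 + \sqrt{3})^{n-1} - (2 - \sqrt{3})^{n-1} & (2 - \sqrt{3})^{n-2} - (2 + \sqrt{3})^{n-2} \\
(2 + \sqrt{3})^{n-2} - (2 - \sqrt{3})^{n-2} & (2 - \sqrt{3})^{n-3} - (2 + \sqrt{3})^{n-3}
\end{bmatrix} \]

and hence

\[ D_n = \frac{(2 + \sqrt{3})^{n+1} - (2 - \sqrt{3})^{n+1}}{2\sqrt{3}} \]

for $n = 1, 2, 3, \ldots$.

(d) Using a computer, check this result for $1 \le n \le 10$.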



T2. In this exercise, we determine a formula for calculating $A_n^{-1}$ from $D_k$ for $k = 0, 1, 2, 3, \ldots, n$, assuming that $D_0$ is defined to
be 1.

(a) Use a computer to compute $A_k^{-1}$ for $k = 1, 2, 3, 4$, and 5.

(b) From your results in part (a), discover the conjecture that

\[ A_n^{-1} = [\alpha_{ij}] \]

where $\alpha_{ij} = \alpha_{ji}$ and

\[ \alpha_{ij} = (-1)^{i+j} \left( \frac{D_{n-j}D_{i-1}}{D_n} \right) \]

for $i \le j$.

(c) Use the result in part (b) to compute ^-1 and compare it to the result obtained using the computer. 
11.6

MARKOV CHAINS 



In this section we describe a general model of a system that changes from state 
to state. We then apply the model to several concrete problems. 







Prerequisites: Linear Systems
Matrices
Intuitive Understanding of Limits







A Markov Process 

Suppose a physical or mathematical system undergoes a process of change such that at any moment it can occupy one of a finite 
number of states. For example, the weather in a certain city could be in one of three possible states: sunny, cloudy, or rainy. Or an 
individual could be in one of four possible emotional states: happy, sad, angry, or apprehensive. Suppose that such a system 
changes with time from one state to another and at scheduled times the state of the system is observed. If the state of the system at 
any observation cannot be predicted with certainty, but the probability that a given state occurs can be predicted by just knowing 
the state of the system at the preceding observation, then the process of change is called a Markov chain or Markov process. 



DEFINITION 



If a Markov chain has k possible states, which we label as 1, 2, . . ., k, then the probability that the system is in state i at any
observation after it was in state j at the preceding observation is denoted by $p_{ij}$ and is called the transition probability from
state j to state i. The matrix $P = [p_{ij}]$ is called the transition matrix of the Markov chain.

For example, in a three-state Markov chain, the transition matrix has the form

               Preceding State
                1       2       3
           1 [ p_11    p_12    p_13 ]
New State  2 [ p_21    p_22    p_23 ]
           3 [ p_31    p_32    p_33 ]

In this matrix, $p_{32}$ is the probability that the system will change from state 2 to state 3, $p_{11}$ is the probability that the system will
still be in state 1 if it was previously in state 1, and so forth.



EXAMPLE 1 Transition Matrix of the Markov Chain 



A car rental agency has three rental locations, denoted by 1, 2, and 3. A customer may rent a car from any of the three locations 
and return the car to any of the three locations. The manager finds that customers return the cars to the various locations according 
to the following probabilities: 



              Rented from Location
                1     2     3
Returned   1 [ .8    .3    .2 ]
to         2 [ .1    .2    .6 ]
Location   3 [ .1    .5    .2 ]


This matrix is the transition matrix of the system considered as a Markov chain. From this matrix, the probability is .6 that a car 
rented from location 3 will be returned to location 2, the probability is .8 that a car rented from location 1 will be returned to 
location 1, and so forth. 



EXAMPLE 2 Transition Matrix of the Markov Chain 



By reviewing its donation records, the alumni office of a college finds that 80% of its alumni who contribute to the annual fund 
one year will also contribute the next year, and 30% of those who do not contribute one year will contribute the next. This can be 
viewed as a Markov chain with two states: state 1 corresponds to an alumnus giving a donation in any one year, and state 2 
corresponds to the alumnus not giving a donation in that year. The transition matrix is 

\[ P = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix} \]

In the examples above, the transition matrices of the Markov chains have the property that the entries in any column sum to 1. This
is not accidental. If $P = [p_{ij}]$ is the transition matrix of any Markov chain with k states, then for each j we must have

\[ p_{1j} + p_{2j} + \cdots + p_{kj} = 1 \tag{1} \]

because if the system is in state j at one observation, it is certain to be in one of the k possible states at the next observation. 

A matrix with property 1 is called a stochastic matrix, a probability matrix, or a Markov matrix. From the preceding discussion, it 
follows that the transition matrix for a Markov chain must be a stochastic matrix. 
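Property 1 is easy to test mechanically. The sketch below checks that a square matrix has nonnegative entries and that each column sums to 1; the function name `is_stochastic` and the tolerance `tol` are illustrative assumptions (the tolerance only allows for floating-point rounding).

```python
def is_stochastic(P, tol=1e-12):
    """Return True if P is square, has nonnegative entries, and every column sums to 1."""
    k = len(P)
    if any(len(row) != k for row in P):
        return False
    if any(entry < 0 for row in P for entry in row):
        return False
    # Column j collects the probabilities of leaving state j.
    return all(abs(sum(P[i][j] for i in range(k)) - 1.0) <= tol for j in range(k))


# Transition matrices of Example 1 and Example 2.
P1 = [[0.8, 0.3, 0.2],
      [0.1, 0.2, 0.6],
      [0.1, 0.5, 0.2]]
P2 = [[0.8, 0.3],
      [0.2, 0.7]]
# A matrix whose rows, but not columns, sum to 1.
R = [[0.5, 0.5],
     [0.6, 0.4]]
print(is_stochastic(P1), is_stochastic(P2), is_stochastic(R))
```

Note that the convention here is column sums, matching the definition of $p_{ij}$ as the probability of moving from state j to state i.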

In a Markov chain, the state of the system at any observation time cannot generally be determined with certainty. The best one can 
usually do is specify probabilities for each of the possible states. For example, in a Markov chain with three states, we might 
describe the possible state of the system at some observation time by a column vector 



\[ \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \]

in which $x_1$ is the probability that the system is in state 1, $x_2$ the probability that it is in state 2, and $x_3$ the probability that it is in
state 3. In general we make the following definition.



DEFINITION 



The state vector for an observation of a Markov chain with k states is a column vector x whose ith component $x_i$ is the
probability that the system is in the ith state at that time.

Observe that the entries in any state vector for a Markov chain are nonnegative and have a sum of 1. (Why?) A column vector that
has this property is called a probability vector.

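Let us suppose now that we know the state vector $\mathbf{x}^{(0)}$ for a Markov chain at some initial observation. The following theorem will
enable us to determine the state vectors

\[ \mathbf{x}^{(1)},\ \mathbf{x}^{(2)},\ \ldots,\ \mathbf{x}^{(n)},\ \ldots \]

at the subsequent observation times.

THEOREM 11.6.1

If P is the transition matrix of a Markov chain and $\mathbf{x}^{(n)}$ is the state vector at the nth observation, then $\mathbf{x}^{(n+1)} = P\mathbf{x}^{(n)}$.

The proof of this theorem involves ideas from probability theory and will not be given here. From this theorem, it follows that

\[ \begin{aligned}
\mathbf{x}^{(1)} &= P\mathbf{x}^{(0)} \\
\mathbf{x}^{(2)} &= P\mathbf{x}^{(1)} = P^2\mathbf{x}^{(0)} \\
\mathbf{x}^{(3)} &= P\mathbf{x}^{(2)} = P^3\mathbf{x}^{(0)} \\
&\ \ \vdots \\
\mathbf{x}^{(n)} &= P\mathbf{x}^{(n-1)} = P^n\mathbf{x}^{(0)}
\end{aligned} \]

In this way, the initial state vector $\mathbf{x}^{(0)}$ and the transition matrix P determine $\mathbf{x}^{(n)}$ for $n = 1, 2, \ldots$.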
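The two ways of reaching the nth state vector, repeated multiplication by P or a single multiplication by the nth power of P, can be compared directly. The following Python sketch uses the transition matrix of Example 2 and a starting vector chosen for illustration; the helper names `mat_vec`, `mat_mul`, and `mat_power` are hypothetical.

```python
def mat_vec(A, x):
    """Matrix-vector product A x for a list-of-rows matrix."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def mat_mul(A, B):
    """Matrix product A B."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_power(A, n):
    """A raised to the power n (n >= 1) by repeated multiplication."""
    result = A
    for _ in range(n - 1):
        result = mat_mul(result, A)
    return result


P = [[0.8, 0.3],
     [0.2, 0.7]]          # transition matrix of Example 2
x0 = [0.0, 1.0]           # system starts in state 2 with certainty

# Step by step: x(n+1) = P x(n)
x = x0
for _ in range(3):
    x = mat_vec(P, x)

# In one step: x(3) = P^3 x(0)
x_direct = mat_vec(mat_power(P, 3), x0)
print(x, x_direct)
```

Both routes give the same vector up to floating-point rounding; the step-by-step form is usually cheaper when every intermediate state vector is wanted anyway.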



EXAMPLE 3 Example 2 Revisited 



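The transition matrix in Example 2 was

\[ P = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix} \]

We now construct the probable future donation record of a new graduate who did not give a donation in the initial year after
graduation. For such a graduate the system is initially in state 2 with certainty, so the initial state vector is

\[ \mathbf{x}^{(0)} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]

From Theorem 11.6.1 we then have

\[ \begin{aligned}
\mathbf{x}^{(1)} &= P\mathbf{x}^{(0)} = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} .3 \\ .7 \end{bmatrix} \\
\mathbf{x}^{(2)} &= P\mathbf{x}^{(1)} = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix} \begin{bmatrix} .3 \\ .7 \end{bmatrix} = \begin{bmatrix} .45 \\ .55 \end{bmatrix} \\
\mathbf{x}^{(3)} &= P\mathbf{x}^{(2)} = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix} \begin{bmatrix} .45 \\ .55 \end{bmatrix} = \begin{bmatrix} .525 \\ .475 \end{bmatrix}
\end{aligned} \]

Thus, after three years the alumnus can be expected to make a donation with probability .525. Beyond three years, we find the
following state vectors (to three decimal places):

\[ \begin{aligned}
\mathbf{x}^{(4)} &= \begin{bmatrix} .563 \\ .438 \end{bmatrix}, &
\mathbf{x}^{(5)} &= \begin{bmatrix} .581 \\ .419 \end{bmatrix}, &
\mathbf{x}^{(6)} &= \begin{bmatrix} .591 \\ .409 \end{bmatrix}, &
\mathbf{x}^{(7)} &= \begin{bmatrix} .595 \\ .405 \end{bmatrix} \\
\mathbf{x}^{(8)} &= \begin{bmatrix} .598 \\ .402 \end{bmatrix}, &
\mathbf{x}^{(9)} &= \begin{bmatrix} .599 \\ .401 \end{bmatrix}, &
\mathbf{x}^{(10)} &= \begin{bmatrix} .599 \\ .401 \end{bmatrix}, &
\mathbf{x}^{(11)} &= \begin{bmatrix} .600 \\ .400 \end{bmatrix}
\end{aligned} \]

For all n beyond 11, we have

\[ \mathbf{x}^{(n)} = \begin{bmatrix} .600 \\ .400 \end{bmatrix} \]

to three decimal places. In other words, the state vectors converge to a fixed vector as the number of observations increases. (We
shall discuss this further below.)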



EXAMPLE 4 Example 1 Revisited 



The transition matrix in Example 1 was

\[ P = \begin{bmatrix} .8 & .3 & .2 \\ .1 & .2 & .6 \\ .1 & .5 & .2 \end{bmatrix} \]

If a car is rented initially from location 2, then the initial state vector is

\[ \mathbf{x}^{(0)} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \]

Using this vector and Theorem 11.6.1, one obtains the later state vectors listed in Table 1.

Table 1

n          0     1     2     3     4     5     6     7     8     9     10    11
x_1^(n)    0     .300  .400  .477  .511  .533  .544  .550  .553  .555  .556  .557
x_2^(n)    1     .200  .370  .252  .261  .240  .238  .233  .232  .231  .230  .230
x_3^(n)    0     .500  .230  .271  .228  .227  .219  .217  .215  .214  .214  .213

For all values of n greater than 11, all state vectors are equal to $\mathbf{x}^{(11)}$ to three decimal places.

Two things should be observed in this example. First, it was not necessary to know how long a customer kept the car. That is, in a 
Markov process the time period between observations need not be regular. Second, the state vectors approach a fixed vector as n 
increases, just as in the first example. 
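The entries of Table 1 can be regenerated with a short loop. The sketch below applies Theorem 11.6.1 eleven times to the transition matrix of Example 1; the helper name `mat_vec` and the print layout are illustrative choices.

```python
def mat_vec(A, x):
    """Matrix-vector product A x for a list-of-rows matrix."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


P = [[0.8, 0.3, 0.2],
     [0.1, 0.2, 0.6],
     [0.1, 0.5, 0.2]]      # transition matrix of Example 1
states = [[0.0, 1.0, 0.0]]  # car rented initially from location 2

for n in range(1, 12):      # x(n) = P x(n-1) for n = 1, ..., 11
    states.append(mat_vec(P, states[-1]))

for n, x in enumerate(states):
    print(n, "  ".join(f"{v:.3f}" for v in x))
```

Each printed row corresponds to one column of Table 1, and every state vector remains a probability vector because P is a stochastic matrix.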



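EXAMPLE 5 Using Theorem 11.6.1

A traffic officer is assigned to control the traffic at the eight intersections indicated in Figure 11.6.1. She is instructed to remain at
each intersection for an hour and then to either remain at the same intersection or move to a neighboring intersection. To avoid
establishing a pattern, she is told to choose her new intersection on a random basis, with each possible choice equally likely. For
example, if she is at intersection 5, her next intersection can be 2, 4, 5, or 8, each with probability $\tfrac{1}{4}$. Every day she starts at the
location where she stopped the day before. The transition matrix for this Markov chain (column j is the old intersection, row i the new intersection) is

\[ P = \begin{bmatrix}
\tfrac13 & \tfrac13 & 0 & \tfrac15 & 0 & 0 & 0 & 0 \\
\tfrac13 & \tfrac13 & 0 & 0 & \tfrac14 & 0 & 0 & 0 \\
0 & 0 & \tfrac13 & \tfrac15 & 0 & \tfrac13 & 0 & 0 \\
\tfrac13 & 0 & \tfrac13 & \tfrac15 & \tfrac14 & 0 & \tfrac14 & 0 \\
0 & \tfrac13 & 0 & \tfrac15 & \tfrac14 & 0 & 0 & \tfrac13 \\
0 & 0 & \tfrac13 & 0 & 0 & \tfrac13 & \tfrac14 & 0 \\
0 & 0 & 0 & \tfrac15 & 0 & \tfrac13 & \tfrac14 & \tfrac13 \\
0 & 0 & 0 & 0 & \tfrac14 & 0 & \tfrac14 & \tfrac13
\end{bmatrix} \]

Figure 11.6.1

If the traffic officer begins at intersection 5, her probable locations, hour by hour, are given by the state vectors given in Table 2.
For all values of n greater than 22, all state vectors are equal to $\mathbf{x}^{(22)}$ to three decimal places. Thus, as with the first two examples,
the state vectors approach a fixed vector as n increases.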



Table 2

n          0     1     2     3     4     5     10    15    20    22
x_1^(n)    0     .000  .133  .116  .130  .123  .113  .109  .108  .107
x_2^(n)    0     .250  .146  .163  .140  .138  .115  .109  .108  .107
x_3^(n)    0     .000  .050  .039  .067  .073  .100  .106  .107  .107
x_4^(n)    0     .250  .113  .187  .162  .178  .178  .179  .179  .179
x_5^(n)    1     .250  .279  .190  .190  .168  .149  .144  .143  .143
x_6^(n)    0     .000  .000  .050  .056  .074  .099  .105  .107  .107
x_7^(n)    0     .000  .133  .104  .131  .125  .138  .142  .143  .143
x_8^(n)    0     .250  .146  .152  .124  .121  .108  .107  .107  .107
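The state vectors of Example 5 can be recomputed exactly with rational arithmetic. In the sketch below, the dictionary `choices` (the intersections the officer may move to from each intersection, including staying put) is an assumption inferred from the probabilities in Table 2, since Figure 11.6.1 is not reproduced here; the helper names are also hypothetical. The list `limit` encodes the observation that the entries at n = 22 agree, to three decimals, with (number of choices)/28.

```python
from fractions import Fraction

# Assumed choices at each intersection (inferred from Table 2, not read from the figure).
choices = {1: [1, 2, 4], 2: [1, 2, 5], 3: [3, 4, 6], 4: [1, 3, 4, 5, 7],
           5: [2, 4, 5, 8], 6: [3, 6, 7], 7: [4, 6, 7, 8], 8: [5, 7, 8]}
k = len(choices)

# p_ij = 1 / (number of choices at j) if i can be reached from j, else 0.
P = [[Fraction(1, len(choices[j])) if i in choices[j] else Fraction(0)
      for j in range(1, k + 1)] for i in range(1, k + 1)]


def mat_vec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def mat_mul(A, B):
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_power(A, n):
    result = A
    for _ in range(n - 1):
        result = mat_mul(result, A)
    return result


def all_positive(A):
    return all(entry > 0 for row in A for entry in row)


states = [[Fraction(int(i == 5)) for i in range(1, k + 1)]]   # start at intersection 5
for _ in range(22):
    states.append(mat_vec(P, states[-1]))

limit = [Fraction(len(choices[i]), 28) for i in range(1, k + 1)]  # observed limiting values

for n in (0, 1, 2, 3, 4, 5, 10, 15, 20, 22):
    print(n, " ".join(f"{float(v):.3f}" for v in states[n]))
print(all_positive(mat_power(P, 3)), all_positive(mat_power(P, 4)))
```

Under the assumed neighbor lists, the third power of P still contains zero entries while the fourth power does not, which is consistent with the remark later in this section about Example 5.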























Limiting Behavior of the State Vectors 

In our examples we saw that the state vectors approached some fixed vector as the number of observations increased. We now ask 
whether the state vectors always approach a fixed vector in a Markov chain. A simple example shows that this is not the case. 



EXAMPLE 6 System Oscillates between Two State Vectors 



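Let 

$$P = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \quad\text{and}\quad \mathbf{x}^{(0)} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$

Then, because $P^2 = I$ and $P^3 = P$, we have that 

$$\mathbf{x}^{(0)} = \mathbf{x}^{(2)} = \mathbf{x}^{(4)} = \cdots = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$

and 

$$\mathbf{x}^{(1)} = \mathbf{x}^{(3)} = \mathbf{x}^{(5)} = \cdots = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

This system oscillates indefinitely between the two state vectors 

$$\begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

so it does not approach any fixed vector. 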



However, if we impose a mild condition on the transition matrix, we can show that a fixed limiting state vector is approached. This 
condition is described by the following definition. 



DEFINITION 



A transition matrix is regular if some integer power of it has all positive entries. 



Thus, for a regular transition matrix P, there is some positive integer m such that all entries of $P^m$ are positive. This is the case with 
the transition matrices of Examples 1 and 2 for m = 1. In Example 5 it turns out that $P^4$ has all positive entries. 
Consequently, in all three examples the transition matrices are regular. 
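
The definition can be checked mechanically by raising a transition matrix to successive powers until every entry is positive. The following is a minimal sketch, not part of the text: the names `mat_mul` and `first_positive_power` are illustrative, and the default search limit assumes Wielandt's bound (a regular k x k matrix already has an all-positive power by the ((k - 1)^2 + 1)-th), which is an outside fact that the text does not prove.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def first_positive_power(P, max_power=None):
    """Return the smallest m such that all entries of P^m are positive.

    Returns None if no such power is found up to max_power.
    """
    k = len(P)
    if max_power is None:
        # Assumption: Wielandt's bound for primitive (regular) matrices.
        max_power = (k - 1) ** 2 + 1
    power = P
    for m in range(1, max_power + 1):
        if all(entry > 0 for row in power for entry in row):
            return m
        power = mat_mul(power, P)
    return None


donations = [[0.8, 0.3], [0.2, 0.7]]  # Example 2, as restated in Example 7
swap = [[0, 1], [1, 0]]               # the oscillating system of Example 6

print(first_positive_power(donations))  # 1
print(first_positive_power(swap))       # None
```

A result of `None` within the bound indicates that the matrix is not regular, which is the situation in Example 6.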

A Markov chain that is governed by a regular transition matrix is called a regular Markov chain. We shall see that every regular 
Markov chain has a fixed state vector q such that $P^n\mathbf{x}^{(0)}$ approaches q as n increases for any choice of $\mathbf{x}^{(0)}$. This result is of major 
importance in the theory of Markov chains. It is based on the following theorem. 



THEOREM 11.6.2 



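Behavior of $P^n$ as $n \to \infty$

If P is a regular transition matrix, then as $n \to \infty$, 

$$P^n \to \begin{bmatrix} q_1 & q_1 & \cdots & q_1 \\ q_2 & q_2 & \cdots & q_2 \\ \vdots & \vdots & & \vdots \\ q_k & q_k & \cdots & q_k \end{bmatrix}$$

where the $q_i$ are positive numbers such that $q_1 + q_2 + \cdots + q_k = 1$. 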



We will not prove this theorem here. The interested reader is referred to a more specialized text, such as J. Kemeny and J. Snell, 
Finite Markov Chains (New York: Springer-Verlag, 1976). 



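Let us set 

$$Q = \begin{bmatrix} q_1 & q_1 & \cdots & q_1 \\ q_2 & q_2 & \cdots & q_2 \\ \vdots & \vdots & & \vdots \\ q_k & q_k & \cdots & q_k \end{bmatrix} \quad\text{and}\quad \mathbf{q} = \begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_k \end{bmatrix}$$
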

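Thus, Q is a transition matrix, all of whose columns are equal to the probability vector q. Q has the property that if x is any 
probability vector, then 

$$Q\mathbf{x} = \begin{bmatrix} q_1 & q_1 & \cdots & q_1 \\ q_2 & q_2 & \cdots & q_2 \\ \vdots & \vdots & & \vdots \\ q_k & q_k & \cdots & q_k \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \end{bmatrix} = \begin{bmatrix} q_1x_1 + q_1x_2 + \cdots + q_1x_k \\ q_2x_1 + q_2x_2 + \cdots + q_2x_k \\ \vdots \\ q_kx_1 + q_kx_2 + \cdots + q_kx_k \end{bmatrix} = (x_1 + x_2 + \cdots + x_k) \begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_k \end{bmatrix} = (1)\mathbf{q} = \mathbf{q}$$
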

That is, Q transforms any probability vector x into the fixed probability vector q. This result leads to the following theorem. 



THEOREM 11.6.3 



Behavior of $P^n\mathbf{x}$ as $n \to \infty$

If P is a regular transition matrix and x is any probability vector, then as $n \to \infty$, 

$$P^n\mathbf{x} \to \begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_k \end{bmatrix} = \mathbf{q}$$

where q is a fixed probability vector, independent of n, all of whose entries are positive. 



This result holds since Theorem 11.6.2 implies that $P^n \to Q$ as $n \to \infty$. This in turn implies that $P^n\mathbf{x} \to Q\mathbf{x} = \mathbf{q}$ as 
$n \to \infty$. Thus, for a regular Markov chain, the system eventually approaches a fixed state vector q. The vector q is called the 
steady-state vector of the regular Markov chain. 

For systems with many states, usually the most efficient technique of computing the steady-state vector q is simply to calculate $P^n\mathbf{x}$ 
for some large n. Our examples illustrate this procedure. Each is a regular Markov process, so that convergence to a steady-state 
vector is ensured. Another way of computing the steady-state vector is to make use of the following theorem. 
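
The procedure of calculating $P^n\mathbf{x}$ for large n can be sketched as a simple loop. The code below is only an illustration: the function names, the stopping tolerance, and the starting vector (state 1 with certainty) are assumptions made for this sketch, and the matrix is the transition matrix of Example 1 as it is restated in Example 8.

```python
def step(P, x):
    """Return the product P x, where the columns of P sum to 1."""
    return [sum(p * xj for p, xj in zip(row, x)) for row in P]


def iterate_to_steady_state(P, x0, tol=1e-12, max_steps=10_000):
    """Apply x -> P x until successive state vectors differ by less than tol."""
    x = list(x0)
    for n in range(1, max_steps + 1):
        nxt = step(P, x)
        if max(abs(a - b) for a, b in zip(nxt, x)) < tol:
            return nxt, n
        x = nxt
    raise RuntimeError("state vectors did not settle within max_steps")


# Transition matrix of Example 1 (as restated in Example 8).
P = [[0.8, 0.3, 0.2],
     [0.1, 0.2, 0.6],
     [0.1, 0.5, 0.2]]

# Assumed starting vector: the system begins in state 1 with certainty.
q, steps = iterate_to_steady_state(P, [1.0, 0.0, 0.0])
print([round(v, 4) for v in q])
```

Because the chain is regular, any other probability vector used as the starting point should settle to the same limit.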



THEOREM 11.6.4 



Steady-State Vector 

The steady-state vector q of a regular transition matrix P is the unique probability vector that satisfies the equation $P\mathbf{q} = \mathbf{q}$. 



To see this, consider the matrix identity $PP^n = P^{n+1}$. By Theorem 11.6.2, both $P^n$ and $P^{n+1}$ approach Q as $n \to \infty$. Thus, we 
have PQ = Q. Any one column of this matrix equation gives $P\mathbf{q} = \mathbf{q}$. To show that q is the only probability vector that satisfies this 
equation, suppose r is another probability vector such that $P\mathbf{r} = \mathbf{r}$. Then also $P^n\mathbf{r} = \mathbf{r}$ for n = 1, 2, .... When we let $n \to \infty$, 
Theorem 11.6.3 leads to q = r. 

Theorem 11.6.4 can also be expressed by the statement that the homogeneous linear system 

$$(I - P)\mathbf{q} = \mathbf{0}$$

has a unique solution vector q with nonnegative entries that satisfy the condition $q_1 + q_2 + \cdots + q_k = 1$. We can apply this 
technique to the computation of the steady-state vectors for our examples. 



EXAMPLE 7 Example 2 Revisited 



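In Example 2 the transition matrix was 

$$P = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix}$$

so the linear system $(I - P)\mathbf{q} = \mathbf{0}$ is 

$$\begin{bmatrix} .2 & -.3 \\ -.2 & .3 \end{bmatrix} \begin{bmatrix} q_1 \\ q_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \qquad (2)$$

This leads to the single independent equation 

$$.2q_1 - .3q_2 = 0$$

or 

$$q_1 = 1.5q_2$$

Thus, when we set $q_2 = s$, any solution of (2) is of the form 

$$\mathbf{q} = s \begin{bmatrix} 1.5 \\ 1 \end{bmatrix}$$

where s is an arbitrary constant. To make the vector q a probability vector, we set s = 1/(1.5 + 1) = .4. Consequently, 

$$\mathbf{q} = \begin{bmatrix} .6 \\ .4 \end{bmatrix}$$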

is the steady-state vector of this regular Markov chain. This means that over the long run, 60% of the alumni will give a donation 
in any one year, and 40% will not. Observe that this agrees with the result obtained numerically in Example 3. 



EXAMPLE 8 Example 1 Revisited 



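In Example 1 the transition matrix was 

$$P = \begin{bmatrix} .8 & .3 & .2 \\ .1 & .2 & .6 \\ .1 & .5 & .2 \end{bmatrix}$$

so the linear system $(I - P)\mathbf{q} = \mathbf{0}$ is 

$$\begin{bmatrix} .2 & -.3 & -.2 \\ -.1 & .8 & -.6 \\ -.1 & -.5 & .8 \end{bmatrix} \begin{bmatrix} q_1 \\ q_2 \\ q_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

The reduced row-echelon form of the coefficient matrix is (verify) 

$$\begin{bmatrix} 1 & 0 & -\frac{34}{13} \\ 0 & 1 & -\frac{14}{13} \\ 0 & 0 & 0 \end{bmatrix}$$

so the original linear system is equivalent to the system 

$$q_1 = \tfrac{34}{13}q_3, \qquad q_2 = \tfrac{14}{13}q_3$$

When we set $q_3 = s$, any solution of the linear system is of the form 

$$\mathbf{q} = s \begin{bmatrix} \frac{34}{13} \\ \frac{14}{13} \\ 1 \end{bmatrix}$$

To make this a probability vector, we set 

$$s = \frac{1}{\frac{34}{13} + \frac{14}{13} + 1} = \frac{13}{61}$$

Thus, the steady-state vector of the system is 

$$\mathbf{q} = \begin{bmatrix} \frac{34}{61} \\ \frac{14}{61} \\ \frac{13}{61} \end{bmatrix} = \begin{bmatrix} .5573\ldots \\ .2295\ldots \\ .2131\ldots \end{bmatrix}$$
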



This agrees with the result obtained numerically in Table 1. The entries of q give the long-run probabilities that any one car will be 
returned to location 1, 2, or 3, respectively. If the car rental agency has a fleet of 1000 cars, it should design its facilities so that 
there are at least 558 spaces at location 1, at least 230 spaces at location 2, and at least 214 spaces at location 3. 
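
The hand computation in Examples 7 and 8 can be cross-checked with exact rational arithmetic. The sketch below is an assumption-laden illustration rather than the text's own method: it replaces one of the dependent equations of $(I - P)\mathbf{q} = \mathbf{0}$ with the normalization $q_1 + \cdots + q_k = 1$ and then applies Gauss-Jordan elimination; the function name `steady_state` is hypothetical.

```python
from fractions import Fraction as F


def steady_state(P):
    """Exact steady-state vector of a regular transition matrix P.

    Solves (I - P) q = 0 together with q_1 + ... + q_k = 1 by
    Gauss-Jordan elimination over the rationals.
    """
    k = len(P)
    # Augmented rows of the homogeneous system (I - P) q = 0.
    rows = [[F(int(i == j)) - P[i][j] for j in range(k)] + [F(0)]
            for i in range(k)]
    # The equations are dependent (each column of I - P sums to 0), so
    # the last one is swapped for the normalization condition.
    rows[-1] = [F(1)] * k + [F(1)]
    for col in range(k):
        pivot = next(r for r in range(col, k) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        lead = rows[col][col]
        rows[col] = [v / lead for v in rows[col]]
        for r in range(k):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][k] for i in range(k)]


# Transition matrix of Example 1, entered as exact fractions.
P1 = [[F(8, 10), F(3, 10), F(2, 10)],
      [F(1, 10), F(2, 10), F(6, 10)],
      [F(1, 10), F(5, 10), F(2, 10)]]

q = steady_state(P1)
print(q)  # [Fraction(34, 61), Fraction(14, 61), Fraction(13, 61)]
```

Working with `Fraction` avoids rounding, so the output can be compared directly with the fractions obtained by hand.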



EXAMPLE 9 Example 5 Revisited 



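We will not give the details of the calculations but simply state that the unique probability vector solution of the linear system 
$(I - P)\mathbf{q} = \mathbf{0}$ is 

$$\mathbf{q} = \begin{bmatrix} \frac{3}{28} \\ \frac{3}{28} \\ \frac{3}{28} \\ \frac{5}{28} \\ \frac{4}{28} \\ \frac{3}{28} \\ \frac{4}{28} \\ \frac{3}{28} \end{bmatrix} = \begin{bmatrix} .1071\ldots \\ .1071\ldots \\ .1071\ldots \\ .1785\ldots \\ .1428\ldots \\ .1071\ldots \\ .1428\ldots \\ .1071\ldots \end{bmatrix}$$
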


The entries in this vector indicate the proportion of time the traffic officer spends at each intersection over the long term. Thus, if 
the objective is for her to spend the same proportion of time at each intersection, then the strategy of random movement with equal 
probabilities from one intersection to another is not a good one. (See Exercise 5.) 



Exercise Set 11.6 






1. Consider the transition matrix 

$$P = \begin{bmatrix} .4 & .5 \\ .6 & .5 \end{bmatrix}$$

(a) Calculate $\mathbf{x}^{(n)}$ for n = 1, 2, 3, 4, 5, if $\mathbf{x}^{(0)} =$ 

(b) State why P is regular and find its steady-state vector. 



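2. Consider the transition matrix 

$$P = \begin{bmatrix} .2 & .1 & .7 \\ .6 & .4 & .2 \\ .2 & .5 & .1 \end{bmatrix}$$

(a) Calculate $\mathbf{x}^{(1)}$, $\mathbf{x}^{(2)}$, and $\mathbf{x}^{(3)}$ to three decimal places if $\mathbf{x}^{(0)} =$ 

(b) State why P is regular and find its steady-state vector. 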



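3. Find the steady-state vectors of the following regular transition matrices: 

(a) $\begin{bmatrix} \frac{1}{3} & \frac{3}{4} \\ \frac{2}{3} & \frac{1}{4} \end{bmatrix}$

(b) $\begin{bmatrix} .81 & .26 \\ .19 & .74 \end{bmatrix}$

(c) $\begin{bmatrix} \frac{1}{3} & \frac{1}{2} & 0 \\ \frac{1}{3} & 0 & \frac{1}{4} \\ \frac{1}{3} & \frac{1}{2} & \frac{3}{4} \end{bmatrix}$
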



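4. Let P be the transition matrix 

$$P = \begin{bmatrix} \frac{1}{2} & 0 \\ \frac{1}{2} & 1 \end{bmatrix}$$

(a) Show that P is not regular. 

(b) Show that as n increases, $P^n\mathbf{x}^{(0)}$ approaches $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ for any initial state vector $\mathbf{x}^{(0)}$. 

(c) What conclusion of Theorem 11.6.3 is not valid for the steady state of this transition matrix? 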



5. Verify that if P is a k x k regular transition matrix all of whose row sums are equal to 1, then the entries of its steady-state 
vector are all equal to 1/k. 

6. Show that the transition matrix 

$$P = \begin{bmatrix} 0 & \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} & 0 \\ \frac{1}{2} & 0 & \frac{1}{2} \end{bmatrix}$$

is regular, and use Exercise 5 to find its steady-state vector. 

7. John is either happy or sad. If he is happy one day, then he is happy the next day four times out of five. If he is sad one day, 
then he is sad the next day one time out of three. Over the long term, what are the chances that John is happy on any given day? 

8. A country is divided into three demographic regions. It is found that each year 5% of the residents of region 1 move to region 2 
and 5% move to region 3. Of the residents of region 2, 15% move to region 1 and 10% move to region 3. And of the residents 
of region 3, 10% move to region 1 and 5% move to region 2. What percentage of the population resides in each of the three 
regions after a long period of time? 



Section 11.6 



Technology Exercises 



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple, 
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 



T1. Consider the sequence of transition matrices 

$$\{P_2, P_3, P_4, \ldots\}$$

with 

$$P_2 = \begin{bmatrix} 0 & \frac{1}{2} \\ 1 & \frac{1}{2} \end{bmatrix}, \qquad P_3 = \begin{bmatrix} 0 & 0 & \frac{1}{3} \\ 0 & \frac{1}{2} & \frac{1}{3} \\ 1 & \frac{1}{2} & \frac{1}{3} \end{bmatrix}$$

$$P_4 = \begin{bmatrix} 0 & 0 & 0 & \frac{1}{4} \\ 0 & 0 & \frac{1}{3} & \frac{1}{4} \\ 0 & \frac{1}{2} & \frac{1}{3} & \frac{1}{4} \\ 1 & \frac{1}{2} & \frac{1}{3} & \frac{1}{4} \end{bmatrix}, \qquad P_5 = \begin{bmatrix} 0 & 0 & 0 & 0 & \frac{1}{5} \\ 0 & 0 & 0 & \frac{1}{4} & \frac{1}{5} \\ 0 & 0 & \frac{1}{3} & \frac{1}{4} & \frac{1}{5} \\ 0 & \frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \frac{1}{5} \\ 1 & \frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \frac{1}{5} \end{bmatrix}$$

and so on. 

(a) Use a computer to show that each of these four matrices is regular by computing their squares. 

(b) Verify Theorem 11.6.2 by computing the 100th power of $P_k$ for k = 2, 3, 4, 5. Then make a conjecture as to the 
limiting value of $P_k^n$ as $n \to \infty$ for all k = 2, 3, 4, .... 

(c) Verify that the common column $\mathbf{q}_k$ of the limiting matrix you found in part (b) satisfies the equation $P_k\mathbf{q}_k = \mathbf{q}_k$, as 
required by Theorem 11.6.4. 



T2. A mouse is placed in a box with nine rooms as shown in the accompanying figure. Assume that it is equally likely that the 
mouse goes through any door in the room or stays in the room. 



(a) Construct the 9 x 9 transition matrix for this problem and show that it is regular. 



(b) Determine the steady-state vector for the matrix. 



(c) Use a symmetry argument to show that this problem may be solved using only a 3 x 3 matrix. 



Figure Ex-T2 



11.7 

GRAPH THEORY 

In this section we introduce matrix representations of relations among members of a set. We use matrix arithmetic to analyze 
these relationships. 



Prerequisites: Matrix Addition and Multiplication 



Relations among Members of a Set 

There are countless examples of sets with finitely many members in which some relation exists among members of the set. For 
example, the set could consist of a collection of people, animals, countries, companies, sports teams, or cities; and the relation 
between two members, A and B, of such a set could be that person A dominates person B, animal A feeds on animal B, country A 
militarily supports country B, company A sells its product to company B, sports team A consistently beats sports team B, or city A 
has a direct airline flight to city B. 

We shall now show how the theory of directed graphs can be used to mathematically model relations such as those in the 
preceding examples. 

Directed Graphs 

A directed graph is a finite set of elements, $\{P_1, P_2, \ldots, P_n\}$, together with a finite collection of ordered pairs $(P_i, P_j)$ of distinct 
elements of this set, with no ordered pair being repeated. The elements of the set are called vertices, and the ordered pairs are 
called directed edges, of the directed graph. We use the notation $P_i \to P_j$ (which is read "$P_i$ is connected to $P_j$") to indicate that 
the directed edge $(P_i, P_j)$ belongs to the directed graph. Geometrically, we can visualize a directed graph (Figure 11.7.1) by 
representing the vertices as points in the plane and representing the directed edge $P_i \to P_j$ by drawing a line or arc from vertex 
$P_i$ to vertex $P_j$, with an arrow pointing from $P_i$ to $P_j$. If both $P_i \to P_j$ and $P_j \to P_i$ hold (denoted $P_i \leftrightarrow P_j$), we draw a single 
line between $P_i$ and $P_j$ with two oppositely pointing arrows (as with $P_2$ and $P_3$ in the figure). 



Figure 11.7.1 

As in Figure 11.7.1, for example, a directed graph may have separate "components" of vertices that are connected only among 
themselves; and some vertices, such as $P_5$, may not be connected with any other vertex. Also, because $P_i \to P_i$ is not permitted 
in a directed graph, a vertex cannot be connected with itself by a single arc that does not pass through any other vertex. 

Figure 11.7.2 shows diagrams representing three more examples of directed graphs. 



Figure 11.7.2 

With a directed graph having n vertices, we may associate an n x n matrix $M = [m_{ij}]$, called the vertex matrix of the directed 
graph. Its elements are defined by 

$$m_{ij} = \begin{cases} 1, & \text{if } P_i \to P_j \\ 0, & \text{otherwise} \end{cases}$$

for i, j = 1, 2, ..., n. For the three directed graphs in Figure 11.7.2, the corresponding vertex matrices are 



Figure 11.7.2a: M = 
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Figure 11. 7. 26: M = 
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Figure 11.7.2c: M = 



By their definition, vertex matrices have the following two properties: 



(i) All entries are either 0 or 1. 



(ii) All diagonal entries are 0. 



Conversely, any matrix with these two properties determines a unique directed graph having the given matrix as its vertex matrix. 
For example, the matrix 

^0 1 1 0" 

10 

10 1 

l o 

determines the directed graph in Figure 11.7.3. 



M = 




Figure 11.7.3 



EXAMPLE 1 Influences within a Family 



A certain family consists of a mother, father, daughter, and two sons. The family members have influence, or power, over each 
other in the following ways: the mother can influence the daughter and the oldest son; the father can influence the two sons; the 
daughter can influence the father; the oldest son can influence the youngest son; and the youngest son can influence the mother. 
We may model this family influence pattern with a directed graph whose vertices are the five family members. If family member A 
influences family member B, we write $A \to B$. Figure 11.7.4 is the resulting directed graph, where we have used obvious letter 
designations for the five family members. The vertex matrix of this directed graph is 



If 
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Figure 11.7.4 



EXAMPLE 2 Vertex Matrix: Moves on a Chessboard 



In chess the knight moves in an "L"-shaped pattern about the chessboard. For the board in Figure 11.7.5 it may move horizontally 
two squares and then vertically one square, or it may move vertically two squares and then horizontally one square. Thus, from the 
center square in the figure, the knight may move to any of the eight marked shaded squares. Suppose that the knight is restricted to 
the nine numbered squares in Figure 11.7.6. If by $i \to j$ we mean that the knight may move from square i to square j, the directed 
graph in Figure 11.7.7 illustrates all possible moves that the knight may make among these nine squares. In Figure 11.7.8 we have 
"unraveled" Figure 11.7.7 to make the pattern of possible moves clearer. 
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Figure 11.7.5 
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Figure 11.7.8 



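The vertex matrix of this directed graph is given by 

$$M = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$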
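
A vertex matrix of this kind can also be generated directly from the rule for knight moves. The sketch below is illustrative only: it assumes the nine squares are numbered 1 through 9 row by row (the numbering of Figure 11.7.6 is not reproduced here), and the function name `knight_vertex_matrix` is hypothetical.

```python
def knight_vertex_matrix(size=3):
    """Vertex matrix of knight moves on a size x size board.

    Assumption: squares are numbered row by row, so the square in row r,
    column c (both 0-based) has index r * size + c.
    """
    jumps = [(1, 2), (2, 1), (-1, 2), (-2, 1),
             (1, -2), (2, -1), (-1, -2), (-2, -1)]
    n = size * size
    M = [[0] * n for _ in range(n)]
    for r in range(size):
        for c in range(size):
            for dr, dc in jumps:
                rr, cc = r + dr, c + dc
                if 0 <= rr < size and 0 <= cc < size:
                    M[r * size + c][rr * size + cc] = 1
    return M


M = knight_vertex_matrix()
for row in M:
    print(" ".join(str(entry) for entry in row))
```

The row for the center square contains only zeros, reflecting that the knight cannot move to or from that square while restricted to the nine squares.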



In Example 1 the father cannot directly influence the mother; that is, $F \to M$ is not true. But he can influence the youngest son, 
who can then influence the mother. We write this as $F \to YS \to M$ and call it a 2-step connection from F to M. Analogously, 
we call $M \to D$ a 1-step connection, $F \to OS \to YS \to M$ a 3-step connection, and so forth. Let us now consider a 
technique for finding the number of all possible r-step connections (r = 1, 2, ...) from one vertex $P_i$ to another vertex $P_j$ of an 
arbitrary directed graph. (This will include the case when $P_i$ and $P_j$ are the same vertex.) The number of 1-step connections from 
$P_i$ to $P_j$ is simply $m_{ij}$. That is, there is either zero or one 1-step connection from $P_i$ to $P_j$, depending on whether $m_{ij}$ is zero or 
one. For the number of 2-step connections, we consider the square of the vertex matrix. If we let $m_{ij}^{(2)}$ be the (i, j)-th element of 
$M^2$, we have 

$$m_{ij}^{(2)} = m_{i1}m_{1j} + m_{i2}m_{2j} + \cdots + m_{in}m_{nj} \qquad (1)$$

Now, if $m_{i1} = m_{1j} = 1$, there is a 2-step connection $P_i \to P_1 \to P_j$ from $P_i$ to $P_j$. But if either $m_{i1}$ or $m_{1j}$ is zero, such a 2-step 
connection is not possible. Thus $P_i \to P_1 \to P_j$ is a 2-step connection if and only if $m_{i1}m_{1j} = 1$. Similarly, for any k = 1, 2, 
..., n, $P_i \to P_k \to P_j$ is a 2-step connection from $P_i$ to $P_j$ if and only if the term $m_{ik}m_{kj}$ on the right side of (1) is one; 
otherwise, the term is zero. Thus, the right side of (1) is the total number of 2-step connections from $P_i$ to $P_j$. 

A similar argument will work for finding the number of 3-, 4-, ..., n-step connections from $P_i$ to $P_j$. In general, we have the 
following result. 



THEOREM 11.7.1 



Let M be the vertex matrix of a directed graph and let $m_{ij}^{(r)}$ be the (i, j)-th element of $M^r$. Then $m_{ij}^{(r)}$ is equal to the number of 
r-step connections from $P_i$ to $P_j$. 
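
As a small check of this theorem, the family of Example 1 can be encoded from the influences listed there and the entries of the powers of its vertex matrix read off. This is a sketch under stated assumptions: the vertex order (mother, father, daughter, oldest son, youngest son) and the helper names `mat_mul`, `mat_pow`, and `connections` are choices made for illustration.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(M, r):
    """Return M^r for an integer r >= 0 by repeated multiplication."""
    n = len(M)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(r):
        result = mat_mul(result, M)
    return result


# Assumed vertex order: mother, father, daughter, oldest son, youngest son.
members = ["M", "F", "D", "OS", "YS"]
influences = [("M", "D"), ("M", "OS"), ("F", "OS"), ("F", "YS"),
              ("D", "F"), ("OS", "YS"), ("YS", "M")]

index = {name: i for i, name in enumerate(members)}
V = [[0] * len(members) for _ in members]
for a, b in influences:
    V[index[a]][index[b]] = 1


def connections(r, a, b):
    """Number of r-step connections from member a to member b."""
    return mat_pow(V, r)[index[a]][index[b]]


for r in (1, 2, 3):
    print(r, connections(r, "F", "M"))
```

The counts for father to mother are 0, 1, and 1 for r = 1, 2, 3, matching the single 2-step connection F to YS to M and the single 3-step connection F to OS to YS to M described in the text.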



EXAMPLE 3 Using Theorem 11.7.1 



Figure 11.7.9 is the route map of a small airline that services the four cities $P_1$, $P_2$, $P_3$, $P_4$. As a directed graph, its vertex matrix is 

$$M = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix}$$

We have that 

$$M^2 = \begin{bmatrix} 2 & 0 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 0 & 2 & 2 & 0 \\ 2 & 0 & 1 & 1 \end{bmatrix} \quad\text{and}\quad M^3 = \begin{bmatrix} 1 & 3 & 3 & 1 \\ 2 & 2 & 3 & 1 \\ 4 & 0 & 2 & 2 \\ 1 & 3 & 3 & 1 \end{bmatrix}$$

Figure 11.7.9 

If we are interested in connections from city $P_4$ to city $P_3$, we may use Theorem 11.7.1 to find their number. Because $m_{43} = 1$, 
there is one 1-step connection; because $m_{43}^{(2)} = 1$, there is one 2-step connection; and because $m_{43}^{(3)} = 3$, there are three 3-step 
connections. To verify this, from Figure 11.7.9 we find 

1-step connections from $P_4$ to $P_3$: $P_4 \to P_3$ 

2-step connections from $P_4$ to $P_3$: $P_4 \to P_2 \to P_3$ 

3-step connections from $P_4$ to $P_3$: $P_4 \to P_3 \to P_4 \to P_3$ 
                                      $P_4 \to P_2 \to P_1 \to P_3$ 
                                      $P_4 \to P_3 \to P_1 \to P_3$ 
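
The connections counted in this example can also be listed explicitly by a short recursive search. The sketch below is illustrative: the function name `walks` is hypothetical, vertices are 0-based inside the code, and the matrix is the vertex matrix of Example 3 as read from the text.

```python
# Vertex matrix of the airline route map in Example 3 (cities P1..P4).
M = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 0, 0, 1],
     [0, 1, 1, 0]]


def walks(M, start, end, r):
    """List all r-step connections from start to end (0-based vertices)."""
    if r == 0:
        return [[start]] if start == end else []
    found = []
    for nxt, edge in enumerate(M[start]):
        if edge:
            found += [[start] + tail for tail in walks(M, nxt, end, r - 1)]
    return found


# Connections from P4 (index 3) to P3 (index 2).
for r in (1, 2, 3):
    for walk in walks(M, 3, 2, r):
        print(r, " -> ".join(f"P{v + 1}" for v in walk))
```

The number of connections found for each r should equal the (4, 3) entry of the corresponding power of M, which is what Theorem 11.7.1 asserts.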



Cliques 

In everyday language a "clique" is a closely knit group of people (usually three or more) that tends to communicate within itself 
and has no place for outsiders. In graph theory this concept is given a more precise meaning. 



DEFINITION 



A subset of a directed graph is called a clique if it satisfies the following three conditions: 



(i) The subset contains at least three vertices. 



(ii) For each pair of vertices $P_i$ and $P_j$ in the subset, both $P_i \to P_j$ and $P_j \to P_i$ are true. 



(iii) The subset is as large as possible; that is, it is not possible to add another vertex to the subset and still satisfy condition 
(ii). 



This definition suggests that cliques are maximal subsets that are in perfect "communication" with each other. For example, if the 
vertices represent cities, and $P_i \to P_j$ means that there is a direct airline flight from city $P_i$ to city $P_j$, then there is a direct flight 
between any two cities within a clique in either direction. 



EXAMPLE 4 A Directed Graph with Two Cliques 



The directed graph illustrated in Figure 11.7.10 (which might represent the route map of an airline) has two cliques: 

$$\{P_1, P_2, P_3, P_4\} \quad\text{and}\quad \{P_3, P_4, P_6\}$$

This example shows that a directed graph may contain several cliques and that a vertex may simultaneously belong to more than 
one clique. 




For simple directed graphs, cliques can be found by inspection. But for large directed graphs, it would be desirable to have a 
systematic procedure for detecting cliques. For this purpose, it will be helpful to define a matrix $S = [s_{ij}]$ related to a given 
directed graph as follows: 

$$s_{ij} = \begin{cases} 1, & \text{if } P_i \leftrightarrow P_j \\ 0, & \text{otherwise} \end{cases}$$

The matrix S determines a directed graph that is the same as the given directed graph, with the exception that the directed edges 
with only one arrow are deleted. For example, if the original directed graph is given by Figure 11.7.11a, the directed graph that has 
S as its vertex matrix is given in Figure 11.7.11b. The matrix S may be obtained from the vertex matrix M of the original directed 
graph by setting $s_{ij} = 1$ if $m_{ij} = m_{ji} = 1$ and setting $s_{ij} = 0$ otherwise. 



Figure 11.7.11 



The following theorem, which uses the matrix S, is helpful for identifying cliques. 



THEOREM 11.7.2 



Identifying Cliques 

Let $s_{ij}^{(3)}$ be the (i, j)-th element of $S^3$. Then a vertex $P_i$ belongs to some clique if and only if $s_{ii}^{(3)} \neq 0$. 



Proof If $s_{ii}^{(3)} \neq 0$, then there is at least one 3-step connection from $P_i$ to itself in the modified directed graph determined by S. 
Suppose it is $P_i \to P_j \to P_k \to P_i$. In the modified directed graph, all directed relations are two-way, so we also have the 
connections $P_i \leftrightarrow P_j \leftrightarrow P_k \leftrightarrow P_i$. But this means that $\{P_i, P_j, P_k\}$ is either a clique or a subset of a clique. In either case, $P_i$ must 
belong to some clique. The converse statement, "if $P_i$ belongs to a clique, then $s_{ii}^{(3)} \neq 0$," follows in a similar manner. 



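EXAMPLE 5 Using Theorem 11.7.2 



Suppose that a directed graph has as its vertex matrix 

$$M = \begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}$$

Then 

$$S = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix} \quad\text{and}\quad S^3 = \begin{bmatrix} 0 & 3 & 0 & 2 \\ 3 & 0 & 2 & 0 \\ 0 & 2 & 0 & 1 \\ 2 & 0 & 1 & 0 \end{bmatrix}$$

Because all diagonal entries of $S^3$ are zero, it follows from Theorem 11.7.2 that the directed graph has no cliques. 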



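EXAMPLE 6 Using Theorem 11.7.2 



Suppose that a directed graph has as its vertex matrix 

$$M = \begin{bmatrix} 0 & 1 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \end{bmatrix}$$

Then 

$$S = \begin{bmatrix} 0 & 1 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \end{bmatrix} \quad\text{and}\quad S^3 = \begin{bmatrix} 2 & 4 & 0 & 4 & 3 \\ 4 & 2 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 4 & 3 & 0 & 2 & 1 \\ 3 & 1 & 0 & 1 & 0 \end{bmatrix}$$

The nonzero diagonal entries of $S^3$ are $s_{11}^{(3)}$, $s_{22}^{(3)}$, and $s_{44}^{(3)}$. Consequently, in the given directed graph, $P_1$, $P_2$, and $P_4$ belong to 
cliques. Because a clique must contain at least three vertices, the directed graph has only one clique, $\{P_1, P_2, P_4\}$. 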
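
The test of Theorem 11.7.2 is easy to automate. The following sketch is an illustration rather than a procedure given in the text: the function names are hypothetical, the matrix is the vertex matrix of Example 6 as read from the text, and the brute-force search in `cliques` is an added cross-check that is only practical for small graphs.

```python
from itertools import combinations


def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def symmetric_part(M):
    """Matrix S with s_ij = 1 exactly when m_ij = m_ji = 1."""
    n = len(M)
    return [[1 if M[i][j] and M[j][i] else 0 for j in range(n)]
            for i in range(n)]


def clique_members(M):
    """Vertices (1-based) whose diagonal entry of S^3 is nonzero."""
    S = symmetric_part(M)
    S3 = mat_mul(mat_mul(S, S), S)
    return [i + 1 for i in range(len(M)) if S3[i][i] != 0]


def cliques(M):
    """All cliques, found by brute force over subsets of 3 or more vertices."""
    S = symmetric_part(M)
    n = len(S)
    mutual = [set(g) for size in range(3, n + 1)
              for g in combinations(range(n), size)
              if all(S[a][b] for a, b in combinations(g, 2))]
    # Keep only the subsets that cannot be enlarged (condition iii).
    maximal = [g for g in mutual if not any(g < other for other in mutual)]
    return [sorted(v + 1 for v in g) for g in maximal]


M6 = [[0, 1, 0, 1, 1],
      [1, 0, 0, 1, 0],
      [1, 1, 0, 1, 0],
      [1, 1, 0, 0, 0],
      [1, 0, 0, 1, 0]]

print(clique_members(M6))  # [1, 2, 4]
print(cliques(M6))         # [[1, 2, 4]]
```

The diagonal test only says which vertices lie in some clique; grouping them into cliques still requires inspection or a search such as the one above.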

Dominance-Directed Graphs 

In many groups of individuals or animals, there is a definite "pecking order" or dominance relation between any two members of 
the group. That is, given any two individuals A and B, either A dominates B or B dominates A, but not both. In terms of a directed 
graph in which $P_i \to P_j$ means $P_i$ dominates $P_j$, this means that for all distinct pairs, either $P_i \to P_j$ or $P_j \to P_i$, but not 
both. In general, we have the following definition. 



DEFINITION 



A dominance-directed graph is a directed graph such that for any distinct pair of vertices $P_i$ and $P_j$, either $P_i \to P_j$ or 
$P_j \to P_i$, but not both. 



An example of a directed graph satisfying this definition is a league of n sports teams that play each other exactly one time, as in 
one round of a round-robin tournament in which no ties are allowed. If $P_i \to P_j$ means that team $P_i$ beat team $P_j$ in their single 
match, it is easy to see that the definition of a dominance-directed graph is satisfied. For this reason, dominance-directed graphs 
are sometimes called tournaments. 



Figure 11.7.12 illustrates some dominance-directed graphs with three, four, and five vertices, respectively. In these three graphs, 
the circled vertices have the following interesting property: from each one there is either a 1-step or a 2-step connection to any 
other vertex in its graph. In a sports tournament, these vertices would correspond to the most "powerful" teams in the sense that 
these teams either beat any given team or beat some other team that beat the given team. We can now state and prove a theorem 
that guarantees that any dominance-directed graph has at least one vertex with this property. 



Figure 11.7.12 



THEOREM 11.7.3 



Connections in Dominance-Directed Graphs 

In any dominance-directed graph, there is at least one vertex from which there is a 1-step or 2-step connection to any other 
vertex. 



Proof Consider a vertex (there may be several) with the largest total number of 1-step and 2-step connections to other vertices in 
the graph. By renumbering the vertices, we may assume that $P_1$ is such a vertex. Suppose there is some vertex $P_i$ such that there is 
no 1-step or 2-step connection from $P_1$ to $P_i$. Then, in particular, $P_1 \to P_i$ is not true, so that by definition of a 
dominance-directed graph, it must be that $P_i \to P_1$. Next, let $P_k$ be any vertex such that $P_1 \to P_k$ is true. Then we cannot have 
$P_k \to P_i$, as then $P_1 \to P_k \to P_i$ would be a 2-step connection from $P_1$ to $P_i$. Thus, it must be that $P_i \to P_k$. That is, $P_i$ 
has 1-step connections to all the vertices to which $P_1$ has 1-step connections. The vertex $P_i$ must then also have 2-step connections 
to all the vertices to which $P_1$ has 2-step connections. But because, in addition, we have that $P_i \to P_1$, this means that $P_i$ has 
more 1-step and 2-step connections to other vertices than does $P_1$. However, this contradicts the way in which $P_1$ was chosen. 
Hence, there can be no vertex $P_i$ to which $P_1$ has no 1-step or 2-step connection. 



This proof shows that a vertex with the largest total number of 1-step and 2-step connections to other vertices has the property 
stated in the theorem. There is a simple way of finding such vertices using the vertex matrix M and its square $M^2$. The sum of the 
entries in the ith row of M is the total number of 1-step connections from $P_i$ to other vertices, and the sum of the entries of the ith 
row of $M^2$ is the total number of 2-step connections from $P_i$ to other vertices. Consequently, the sum of the entries of the ith row 
of the matrix $A = M + M^2$ is the total number of 1-step and 2-step connections from $P_i$ to other vertices. In other words, a row of 
$A = M + M^2$ with the largest row sum identifies a vertex having the property stated in Theorem 11.7.3. 



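EXAMPLE 7 Using Theorem 11.7.3 



Suppose that five baseball teams play each other exactly once, and the results are as indicated in the dominance-directed graph of 
Figure 11.7.13. The vertex matrix of the graph is 

$$M = \begin{bmatrix} 0 & 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 \end{bmatrix}$$

Figure 11.7.13 

so 

$$A = M + M^2 = \begin{bmatrix} 0 & 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 2 & 3 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 2 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 1 & 2 & 0 \\ 2 & 0 & 3 & 3 & 1 \\ 0 & 1 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 1 \\ 1 & 1 & 2 & 3 & 0 \end{bmatrix}$$

The row sums of A are 

1st row sum = 4 

2nd row sum = 9 

3rd row sum = 2 

4th row sum = 4 

5th row sum = 7 

Because the second row has the largest row sum, the vertex $P_2$ must have a 1-step or 2-step connection to any other vertex. This is 
easily verified from Figure 11.7.13. 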



We have informally suggested that a vertex with the largest number of 1-step and 2-step connections to other vertices is a 
"powerful" vertex. We can formalize this concept with the following definition. 



DEFINITION 



The power of a vertex of a dominance-directed graph is the total number of 1-step and 2-step connections from it to other 
vertices. Alternatively, the power of a vertex $P_i$ is the sum of the entries of the ith row of the matrix $A = M + M^2$, where M is 
the vertex matrix of the directed graph. 



EXAMPLE 8 Example 7 Revisited 



Let us rank the five baseball teams in Example 7 according to their powers. From the calculations for the row sums in that 
example, we have 

Power of team $P_1$ = 4 

Power of team $P_2$ = 9 

Power of team $P_3$ = 2 

Power of team $P_4$ = 4 

Power of team $P_5$ = 7 

Hence, the ranking of the teams according to their powers would be 

$P_2$ (first), $P_5$ (second), $P_1$ and $P_4$ (tied for third), $P_3$ (last) 
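
The power computation lends itself to a few lines of code. The sketch below is illustrative only: the function name `powers` is hypothetical, and the matrix is the vertex matrix of Example 7 as read from the text.

```python
def powers(M):
    """Row sums of A = M + M^2 for a vertex matrix M (list of rows)."""
    n = len(M)
    M2 = [[sum(M[i][k] * M[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [sum(M[i]) + sum(M2[i]) for i in range(n)]


# Vertex matrix of the five-team tournament in Example 7.
M = [[0, 0, 1, 1, 0],
     [1, 0, 1, 0, 1],
     [0, 0, 0, 1, 0],
     [0, 1, 0, 0, 0],
     [1, 0, 1, 1, 0]]

p = powers(M)
# Teams (1-based) ordered from most to least powerful; ties keep team order.
ranking = sorted(range(1, len(M) + 1), key=lambda team: -p[team - 1])

print(p)        # [4, 9, 2, 4, 7]
print(ranking)  # [2, 5, 1, 4, 3]
```

Teams 1 and 4 have equal power, so their relative order in `ranking` is only an artifact of the stable sort and should be read as a tie.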



Exercise Set 11.7 






1. Construct the vertex matrix for each of the directed graphs illustrated in the accompanying figure. 



Figure Ex-1 



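2. Draw a diagram of the directed graph corresponding to each of the following vertex matrices. 



(a) $\begin{bmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{bmatrix}$



(b) $\begin{bmatrix} 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \end{bmatrix}$


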
(c) 



10 10 1 

10 10 



110 10 

10 1 

10 10 



3. Let M be the following vertex matrix of a directed graph: 

$$\begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix}$$

(a) Draw a diagram of the directed graph. 



(b) Use Theorem 11.7.1 to find the number of 1-, 2-, and 3-step connections from the vertex $P_1$ to the vertex $P_2$. Verify 
your answer by listing the various connections as in Example 3. 



(c) Repeat part (b) for the 1-, 2-, and 3-step connections from $P_1$ to $P_4$. 



4. 

(a) Compute the matrix product $M^TM$ for the vertex matrix M in Example 1. 



(b) Verify that the kth diagonal entry of $M^TM$ is the number of family members who influence the kth family member. Why 
is this true? 



(c) Find a similar interpretation for the values of the nondiagonal entries of $M^TM$. 



5. By inspection, locate all cliques in each of the directed graphs illustrated in the accompanying figure. 



Figure Ex-5 

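6. For each of the following vertex matrices, use Theorem 11.7.2 to find all cliques in the corresponding directed graphs. 



(a) $\begin{bmatrix} 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 \\ 1 & 0 & 0 & 0 & 1 \\ 1 & 0 & 1 & 1 & 0 \end{bmatrix}$


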
(b) 



10 110 
10 10 11 
10 10 1 
10 10 11 
10 10 
1110 



7. For the dominance-directed graph illustrated in the accompanying figure, construct the vertex matrix and find the power of each 
vertex. 



Figure Ex-7 

8. Five baseball teams play each other one time with the following results: 

A beats B, C, D 
B beats C, E 
C beats D, E 
D beats B 
E beats A, D 

Rank the five baseball teams in accordance with the powers of the vertices they correspond to in the dominance-directed graph 
representing the outcomes of the games. 



Section 11.7 



Technology Exercises 



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple, 
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 



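T1. A graph having $n$ vertices such that every vertex is connected to every other vertex has vertex matrix given by

$$M_n = \begin{bmatrix} 0 & 1 & 1 & \cdots & 1 \\ 1 & 0 & 1 & \cdots & 1 \\ 1 & 1 & 0 & \cdots & 1 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & 1 & \cdots & 0 \end{bmatrix}$$

In this problem we develop a formula for $M_n^k$ whose $(i, j)$-th entry equals the number of $k$-step connections from $P_i$ to $P_j$.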



(a) Use a computer to compute the eight matrices $M_n^k$ for $n = 2, 3$ and for $k = 2, 3, 4, 5$.



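(b) Use the results in part (a) and symmetry arguments to show that $M_n^k$ can be written as

$$M_n^k = \begin{bmatrix} 0 & 1 & 1 & \cdots & 1 \\ 1 & 0 & 1 & \cdots & 1 \\ 1 & 1 & 0 & \cdots & 1 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & 1 & \cdots & 0 \end{bmatrix}^k = \begin{bmatrix} \alpha_k & \beta_k & \beta_k & \cdots & \beta_k \\ \beta_k & \alpha_k & \beta_k & \cdots & \beta_k \\ \beta_k & \beta_k & \alpha_k & \cdots & \beta_k \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \beta_k & \beta_k & \beta_k & \cdots & \alpha_k \end{bmatrix}$$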



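(c) Using the fact that $M_n^k = M_n M_n^{k-1}$, show that

$$\begin{bmatrix} \alpha_k \\ \beta_k \end{bmatrix} = \begin{bmatrix} 0 & n-1 \\ 1 & n-2 \end{bmatrix} \begin{bmatrix} \alpha_{k-1} \\ \beta_{k-1} \end{bmatrix}$$

with

$$\begin{bmatrix} \alpha_1 \\ \beta_1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$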





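(d) Using part (c), show that

$$\begin{bmatrix} \alpha_k \\ \beta_k \end{bmatrix} = \begin{bmatrix} 0 & n-1 \\ 1 & n-2 \end{bmatrix}^{k-1} \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$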



(e) Use the methods of Section 7.2 to compute

$$\begin{bmatrix} 0 & n-1 \\ 1 & n-2 \end{bmatrix}^{k-1}$$

and thereby obtain expressions for $\alpha_k$ and $\beta_k$, and eventually show that

$$M_n^k = \left( \frac{(n-1)^k - (-1)^k}{n} \right) U_n + (-1)^k I_n$$

where $U_n$ is the $n \times n$ matrix all of whose entries are ones and $I_n$ is the $n \times n$ identity matrix.

(f) Show that for $n > 2$, all vertices for these directed graphs belong to cliques.
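
As a numerical sanity check on the closed form in part (e), the following is a minimal sketch in plain Python. It is not a proof, and the helper names (`mat_mul`, `mat_pow`, `complete_graph_matrix`, `closed_form`) are illustrative choices, not part of the text. It builds $M_n$, raises it to the $k$th power by repeated multiplication, and compares the result with the predicted diagonal entry $\alpha_k$ and off-diagonal entry $\beta_k$.

```python
def mat_mul(a, b):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def complete_graph_matrix(n):
    """Vertex matrix M_n: zeros on the diagonal, ones everywhere else."""
    return [[0 if i == j else 1 for j in range(n)] for i in range(n)]


def mat_pow(m, k):
    """Compute m**k for an integer k >= 1 by repeated multiplication."""
    result = m
    for _ in range(k - 1):
        result = mat_mul(result, m)
    return result


def closed_form(n, k):
    """Matrix predicted by part (e): beta_k off the diagonal, alpha_k on it."""
    # (n-1)^k - (-1)^k is always divisible by n, so integer division is exact.
    beta = ((n - 1) ** k - (-1) ** k) // n
    alpha = beta + (-1) ** k
    return [[alpha if i == j else beta for j in range(n)] for i in range(n)]


# Compare the direct power with the closed form for a few small cases.
for n in (2, 3, 4, 5):
    for k in (1, 2, 3, 4, 5):
        assert mat_pow(complete_graph_matrix(n), k) == closed_form(n, k)
```

Agreement on small cases only suggests the pattern; the argument requested in parts (b) through (e) is still needed to establish it for all $n$ and $k$.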



T2. Consider a round-robin tournament among $n$ players (labeled $a_1, a_2, a_3, \ldots, a_n$) where $a_1$ beats $a_2$, $a_2$ beats $a_3$, $a_3$ beats $a_4$,
..., $a_{n-1}$ beats $a_n$, and $a_n$ beats $a_1$. Compute the "power" of each player, showing that they all have the same power; then
determine that common power.



Hint Use a computer to study the cases n = 3, 4, 5, 6; then make a conjecture and prove your conjecture to be true. 



Copyright © 2005 John Wiley & Sons, Inc. All rights reserved. 



11.8 

GAMES OF STRATEGY 



In this section we discuss a general game in which two competing players 
choose separate strategies to reach opposing objectives. The optimal strategy 
of each player is found in certain cases with the use of matrix techniques. 



Prerequisites: Matrix Multiplication 

Basic Probability Concepts 



Game Theory 

To introduce the basic concepts in the theory of games, we will consider the following carnival-type game that two people agree to
play. We will call the participants in the game player R and player C. Each player has a stationary wheel with a movable pointer on
it as in Figure 11.8.1. For reasons that will become clear, we will call player R's wheel the row-wheel and player C's wheel the
column-wheel. The row-wheel is divided into three sectors numbered 1, 2, and 3, and the column-wheel is divided into four sectors
numbered 1, 2, 3, and 4. The fractions of the area occupied by the various sectors are indicated in the figure. To play the game,
each player spins the pointer of his or her wheel and lets it come to rest at random. The number of the sector in which each pointer
comes to rest is called the move of that player. Thus, player R has three possible moves and player C has four possible moves.
Depending on the move each player makes, player C then makes a payment of money to player R according to Table 1.




Figure 11.8.1 



Table 1

Payment to Player R

                          Player C's Move
                          1       2       3       4
Player R's Move    1     $3      $5     -$2     -$1
                   2    -$2      $4     -$3     -$4
                   3     $6     -$5      $0      $3



For example, if the row-wheel pointer comes to rest in sector 1 (player R makes move 1), and the column-wheel pointer comes to
rest in sector 2 (player C makes move 2), then player C must pay player R the sum of $5. Some of the entries in this table are
negative, indicating that player C makes a negative payment to player R. By this we mean that player R makes a positive payment
to player C. For example, if the row-wheel shows 2 and the column-wheel shows 4, then player R pays player C the sum of $4,
because the corresponding entry in the table is -$4. In this way the positive entries of the table are the gains of player R and the
losses of player C, and the negative entries are the gains of player C and the losses of player R.

In this game the players have no control over their moves; each move is determined by chance. However, if each player can decide 
whether he or she wants to play, then each would want to know how much he or she can expect to win or lose over the long term if 
he or she chooses to play. (Later in the section we will discuss this question and also consider a more complicated situation in 
which the players can exercise some control over their moves by varying the sectors of their wheels.) 

Two-Person Zero-Sum Matrix Games 

The game described above is an example of a two-person zero-sum matrix game. The term zero-sum means that in each play of 
the game, the positive gain of one player is equal to the negative gain (loss) of the other player. That is, the sum of the two gains is 
zero. The term matrix game is used to describe a two-person game in which each player has only a finite number of moves, so that 
all possible outcomes of each play, and the corresponding gains of the players, may be displayed in tabular or matrix form, as in 
Table 1. 

In a general game of this type, let player R have $m$ possible moves and let player C have $n$ possible moves. In a play of the game,
each player makes one of his or her possible moves, and then a payoff is made from player C to player R, depending on the moves.
For $i = 1, 2, \ldots, m$, and $j = 1, 2, \ldots, n$, let us set

$a_{ij}$ = payoff that player C makes to player R if player R makes move $i$ and player C makes move $j$

This payoff need not be money; it may be any type of commodity to which we can attach a numerical value. As before, if an entry
$a_{ij}$ is negative, we mean that player C receives a payoff of $|a_{ij}|$ from player R. We arrange these $mn$ possible payoffs in the form of
an $m \times n$ matrix

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$

which we will call the payoff matrix of the game.



Each player is to make his or her moves on a probabilistic basis. For example, for the game discussed in the introduction, the ratio
of the area of a sector to the area of the wheel would be the probability that the player makes the move corresponding to that
sector. Thus, from Figure 11.8.1, we see that player R would make move 2 with probability $\frac{1}{3}$, and player C would make move 2
with probability $\frac{1}{4}$. In the general case we make the following definitions:

$p_i$ = probability that player R makes move $i$ ($i = 1, 2, \ldots, m$)
$q_j$ = probability that player C makes move $j$ ($j = 1, 2, \ldots, n$)

It follows from these definitions that

$$p_1 + p_2 + \cdots + p_m = 1$$

and

$$q_1 + q_2 + \cdots + q_n = 1$$

With the probabilities $p_i$ and $q_j$ we form two vectors:

$$\mathbf{p} = \begin{bmatrix} p_1 & p_2 & \cdots & p_m \end{bmatrix} \quad \text{and} \quad \mathbf{q} = \begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \end{bmatrix}$$

We call the row vector $\mathbf{p}$ the strategy of player R and the column vector $\mathbf{q}$ the strategy of player C. For example, from Figure
11.8.1 we have

$$\mathbf{p} = \begin{bmatrix} \frac{1}{6} & \frac{1}{3} & \frac{1}{2} \end{bmatrix}, \qquad \mathbf{q} = \begin{bmatrix} \frac{1}{4} \\ \frac{1}{4} \\ \frac{1}{3} \\ \frac{1}{6} \end{bmatrix}$$

for the carnival game described earlier.

From the theory of probability, if the probability that player R makes move $i$ is $p_i$, and independently the probability that player C
makes move $j$ is $q_j$, then $p_i q_j$ is the probability that for any one play of the game, player R makes move $i$ and player C makes
move $j$. The payoff to player R for such a pair of moves is $a_{ij}$. If we multiply each possible payoff by its corresponding probability
and sum over all possible payoffs, we obtain the expression

$$a_{11}p_1q_1 + a_{12}p_1q_2 + \cdots + a_{1n}p_1q_n + a_{21}p_2q_1 + \cdots + a_{mn}p_mq_n \tag{1}$$

Equation 1 is a weighted average of the payoffs to player R; each payoff is weighted according to the probability of its occurrence.
In the theory of probability, this weighted average is called the expected payoff to player R. It can be shown that if the game is
played many times, the long-term average payoff per play to player R is given by this expression. We denote this expected payoff
by $E(\mathbf{p}, \mathbf{q})$ to emphasize the fact that it depends on the strategies of the two players. From the definition of the payoff matrix $A$ and
the strategies $\mathbf{p}$ and $\mathbf{q}$, it can be verified that we may express the expected payoff in matrix notation as

$$E(\mathbf{p}, \mathbf{q}) = \begin{bmatrix} p_1 & p_2 & \cdots & p_m \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \end{bmatrix} = \mathbf{p}A\mathbf{q} \tag{2}$$

Because $E(\mathbf{p}, \mathbf{q})$ is the expected payoff to player R, it follows that $-E(\mathbf{p}, \mathbf{q})$ is the expected payoff to player C.



EXAMPLE 1 Expected Payoff to Player R 



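For the carnival game described earlier, we have

$$E(\mathbf{p}, \mathbf{q}) = \mathbf{p}A\mathbf{q} = \begin{bmatrix} \frac{1}{6} & \frac{1}{3} & \frac{1}{2} \end{bmatrix} \begin{bmatrix} 3 & 5 & -2 & -1 \\ -2 & 4 & -3 & -4 \\ 6 & -5 & 0 & 3 \end{bmatrix} \begin{bmatrix} \frac{1}{4} \\ \frac{1}{4} \\ \frac{1}{3} \\ \frac{1}{6} \end{bmatrix} = \frac{13}{72} = .1805\ldots$$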



Thus, in the long run, player R can expect to receive an average of about 18 cents from player C in each play of the game. 
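
The computation in Example 1 can be reproduced with exact rational arithmetic. The following is a minimal sketch in plain Python; the function name `expected_payoff` is an illustrative choice, and the payoff matrix and strategies are those of the carnival game (Table 1 and Figure 11.8.1).

```python
from fractions import Fraction as F


def expected_payoff(p, A, q):
    """Return E(p, q) = p A q, the sum of a_ij * p_i * q_j over all i and j."""
    return sum(p[i] * A[i][j] * q[j]
               for i in range(len(p)) for j in range(len(q)))


# Payoff matrix of the carnival game (payments from player C to player R).
A = [[3, 5, -2, -1],
     [-2, 4, -3, -4],
     [6, -5, 0, 3]]

p = [F(1, 6), F(1, 3), F(1, 2)]            # strategy of player R (row vector)
q = [F(1, 4), F(1, 4), F(1, 3), F(1, 6)]   # strategy of player C (column vector)

value = expected_payoff(p, A, q)
print(value)         # 13/72
print(float(value))  # roughly 0.18, i.e. about 18 cents per play
```

Using `Fraction` avoids rounding error, so the result can be compared directly with the exact value $\frac{13}{72}$.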

So far we have been discussing the situation in which each player has a predetermined strategy. We will now consider the more
difficult situation in which both players may change their strategies independently. For example, in the game described in the
introduction, we would allow both players to alter the areas of the sectors of their wheels and thereby control the probabilities of
their respective moves. This qualitatively changes the nature of the problem and puts us firmly in the field of true game theory. It is
understood that neither player knows what strategy the other will choose. It is also assumed that each player will make the best
possible choice of strategy and that the other player knows this. Thus, player R attempts to choose a strategy $\mathbf{p}$ such that $E(\mathbf{p}, \mathbf{q})$ is
as large as possible for the best strategy $\mathbf{q}$ that player C can choose; and similarly, player C attempts to choose a strategy $\mathbf{q}$ such
that $E(\mathbf{p}, \mathbf{q})$ is as small as possible for the best strategy $\mathbf{p}$ that player R can choose. To see that such choices are actually possible,
we shall need the following theorem, called the Fundamental Theorem of Two-Person Zero-Sum Games. (The general proof,
which involves ideas from the theory of linear programming, will be omitted. However, later we shall prove two special cases of
the theorem.)

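THEOREM 11.8.1

Fundamental Theorem of Zero-Sum Games

There exist strategies $\mathbf{p}^*$ and $\mathbf{q}^*$ such that

$$E(\mathbf{p}^*, \mathbf{q}) \ge E(\mathbf{p}^*, \mathbf{q}^*) \ge E(\mathbf{p}, \mathbf{q}^*) \tag{3}$$

for all strategies $\mathbf{p}$ and $\mathbf{q}$.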





The strategies $\mathbf{p}^*$ and $\mathbf{q}^*$ in this theorem are the best possible strategies for players R and C, respectively. To see why this is so,
let $v = E(\mathbf{p}^*, \mathbf{q}^*)$. The left-hand inequality of Equation 3 then reads

$E(\mathbf{p}^*, \mathbf{q}) \ge v$ for all strategies $\mathbf{q}$

This means that if player R chooses the strategy $\mathbf{p}^*$, then no matter what strategy $\mathbf{q}$ player C chooses, the expected payoff to
player R will never be below $v$. Moreover, it is not possible for player R to achieve an expected payoff greater than $v$. To see why,
suppose there is some strategy $\mathbf{p}^{**}$ that player R can choose such that

$E(\mathbf{p}^{**}, \mathbf{q}) > v$ for all strategies $\mathbf{q}$

Then, in particular,

$E(\mathbf{p}^{**}, \mathbf{q}^*) > v$

But this contradicts the right-hand inequality of Equation 3, which requires that $v \ge E(\mathbf{p}^{**}, \mathbf{q}^*)$. Consequently, the best player
R can do is prevent his or her expected payoff from falling below the value $v$. Similarly, the best player C can do is ensure that
player R's expected payoff does not exceed $v$, and this can be achieved by using strategy $\mathbf{q}^*$.

On the basis of this discussion, we arrive at the following definitions. 



DEFINITION 



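If $\mathbf{p}^*$ and $\mathbf{q}^*$ are strategies such that

$$E(\mathbf{p}^*, \mathbf{q}) \ge E(\mathbf{p}^*, \mathbf{q}^*) \ge E(\mathbf{p}, \mathbf{q}^*) \tag{4}$$

for all strategies $\mathbf{p}$ and $\mathbf{q}$, then

(i) $\mathbf{p}^*$ is called an optimal strategy for player R.

(ii) $\mathbf{q}^*$ is called an optimal strategy for player C.

(iii) $v = E(\mathbf{p}^*, \mathbf{q}^*)$ is called the value of the game.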



The wording in this definition suggests that optimal strategies are not necessarily unique. This is indeed the case, and in Exercise 2
we ask the reader to show this. However, it can be proved that any two sets of optimal strategies always result in the same value $v$
of the game. That is, if $\mathbf{p}^*$, $\mathbf{q}^*$ and $\mathbf{p}^{**}$, $\mathbf{q}^{**}$ are optimal strategies, then

$$E(\mathbf{p}^*, \mathbf{q}^*) = E(\mathbf{p}^{**}, \mathbf{q}^{**}) \tag{5}$$

The value of a game is thus the expected payoff to player R when both players choose any possible optimal strategies.

To find optimal strategies, we must find vectors $\mathbf{p}^*$ and $\mathbf{q}^*$ that satisfy Equation 4. This is generally done by using linear
programming techniques. Next, we discuss special cases for which optimal strategies may be found by more elementary
techniques.

We now introduce the following definition.



DEFINITION 



An entry $a_{rs}$ in a payoff matrix $A$ is called a saddle point if

(i) $a_{rs}$ is the smallest entry in its row, and

(ii) $a_{rs}$ is the largest entry in its column.

A game whose payoff matrix has a saddle point is called strictly determined.



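For example, the shaded (here boxed) element in each of the following payoff matrices is a saddle point:

$$\begin{bmatrix} 3 & \boxed{1} \\ -4 & 0 \end{bmatrix}, \qquad \begin{bmatrix} 30 & -50 & -5 \\ \boxed{60} & 90 & 75 \\ -10 & 60 & -30 \end{bmatrix}, \qquad \begin{bmatrix} 0 & -3 & 5 & -9 \\ 15 & -8 & -2 & 10 \\ 7 & 10 & \boxed{6} & 9 \\ 6 & 11 & -3 & 2 \end{bmatrix}$$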
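
The saddle-point definition translates directly into a search over the entries of a payoff matrix. The following is a minimal sketch in plain Python, applied to the 3 x 3 and 4 x 4 matrices above; the function name `saddle_points` and the 1-based `(row, column, value)` return format are illustrative choices.

```python
def saddle_points(A):
    """Return all saddle points of A as (row, column, value), 1-based indices.

    An entry qualifies when it is the smallest entry in its row and
    the largest entry in its column.
    """
    points = []
    for r, row in enumerate(A):
        for s, entry in enumerate(row):
            column = [A[i][s] for i in range(len(A))]
            if entry == min(row) and entry == max(column):
                points.append((r + 1, s + 1, entry))
    return points


B = [[30, -50, -5],
     [60, 90, 75],
     [-10, 60, -30]]

C = [[0, -3, 5, -9],
     [15, -8, -2, 10],
     [7, 10, 6, 9],
     [6, 11, -3, 2]]

print(saddle_points(B))  # [(2, 1, 60)]
print(saddle_points(C))  # [(3, 3, 6)]
```

An empty list means the game is not strictly determined, so other techniques are needed to find optimal strategies.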



If a matrix has a saddle point $a_{rs}$, it turns out that the following strategies are optimal strategies for the two players:

$$\mathbf{p}^* = \begin{bmatrix} 0 & \cdots & 1 & \cdots & 0 \end{bmatrix} \ (\text{1 in the } r\text{th entry}), \qquad \mathbf{q}^* = \begin{bmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{bmatrix} \ (\text{1 in the } s\text{th entry})$$

That is, an optimal strategy for player R is to always make the $r$th move, and an optimal strategy for player C is to always make the
$s$th move. Such strategies for which only one move is possible are called pure strategies. Strategies for which more than one move
is possible are called mixed strategies. To show that the above pure strategies are optimal, the reader may verify the following
three equations (see Exercise 6):

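$$E(\mathbf{p}^*, \mathbf{q}^*) = \mathbf{p}^* A \mathbf{q}^* = a_{rs} \tag{6}$$

$$E(\mathbf{p}^*, \mathbf{q}) = \mathbf{p}^* A \mathbf{q} \ge a_{rs} \quad \text{for any strategy } \mathbf{q} \tag{7}$$

$$E(\mathbf{p}, \mathbf{q}^*) = \mathbf{p} A \mathbf{q}^* \le a_{rs} \quad \text{for any strategy } \mathbf{p} \tag{8}$$

Together, these three equations imply that

$$E(\mathbf{p}^*, \mathbf{q}) \ge E(\mathbf{p}^*, \mathbf{q}^*) \ge E(\mathbf{p}, \mathbf{q}^*)$$

for all strategies $\mathbf{p}$ and $\mathbf{q}$. Because this is exactly Equation 4, it follows that $\mathbf{p}^*$ and $\mathbf{q}^*$ are optimal strategies.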

From Equation 6 the value of a strictly determined game is simply the numerical value of a saddle point $a_{rs}$. It is possible for a
payoff matrix to have several saddle points, but then the uniqueness of the value of a game guarantees that the numerical values of
all saddle points are the same.



EXAMPLE 2 Optimal Strategies to Maximize a Viewing Audience 

Two competing television networks, R and C, are scheduling one-hour programs in the same time period. Network R can schedule
one of three possible programs, and network C can schedule one of four possible programs. Neither network knows which
program the other will schedule. Both networks ask the same outside polling agency to give them an estimate of how all possible
pairings of the programs will divide the viewing audience. The agency gives them each Table 2, whose $(i, j)$-th entry is the
percentage of the viewing audience that will watch network R if network R's program $i$ is paired against network C's program $j$.
What program should each network schedule in order to maximize its viewing audience?



Table 2

Audience Percentage for Network R

                              Network C's Program
                              1       2       3       4
Network R's Program    1     60      20      30      55
                       2     50      75      45      60
                       3     70      45      35      30



Solution 



Subtract 50 from each entry in Table 2 to construct the following matrix:

$$\begin{bmatrix} 10 & -30 & -20 & 5 \\ 0 & 25 & -5 & 10 \\ 20 & -5 & -15 & -20 \end{bmatrix}$$

This is the payoff matrix of the two-person zero-sum game in which each network is considered to start with 50% of the audience,
and the $(i, j)$-th entry of the matrix is the percentage of the viewing audience that network C loses to network R if programs $i$ and $j$
are paired against each other. It is easy to see that the entry

$$a_{23} = -5$$

is a saddle point of the payoff matrix. Hence, the optimal strategy of network R is to schedule program 2, and the optimal strategy
of network C is to schedule program 3. This will result in network R's receiving 45% of the audience and network C's receiving
55% of the audience.

2 × 2 Matrix Games

Another case in which the optimal strategies can be found by elementary means occurs when each player has only two possible
moves. In this case, the payoff matrix is a 2 × 2 matrix

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$

If the game is strictly determined, at least one of the four entries of $A$ is a saddle point, and the techniques discussed above can then
be applied to determine optimal strategies for the two players. If the game is not strictly determined, we first compute the expected
payoff for arbitrary strategies $\mathbf{p}$ and $\mathbf{q}$:

$$E(\mathbf{p}, \mathbf{q}) = \mathbf{p}A\mathbf{q} = \begin{bmatrix} p_1 & p_2 \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} q_1 \\ q_2 \end{bmatrix} = a_{11}p_1q_1 + a_{12}p_1q_2 + a_{21}p_2q_1 + a_{22}p_2q_2 \tag{9}$$

Because

$$p_1 + p_2 = 1 \quad \text{and} \quad q_1 + q_2 = 1 \tag{10}$$

we may substitute $p_2 = 1 - p_1$ and $q_2 = 1 - q_1$ into (9) to obtain

$$E(\mathbf{p}, \mathbf{q}) = a_{11}p_1q_1 + a_{12}p_1(1 - q_1) + a_{21}(1 - p_1)q_1 + a_{22}(1 - p_1)(1 - q_1) \tag{11}$$

If we rearrange the terms in Equation 11, we may write

$$E(\mathbf{p}, \mathbf{q}) = [(a_{11} + a_{22} - a_{12} - a_{21})p_1 - (a_{22} - a_{21})]q_1 + (a_{12} - a_{22})p_1 + a_{22} \tag{12}$$

By examining the coefficient of the $q_1$ term in (12), we see that if we set

$$p_1 = p_1^* = \frac{a_{22} - a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}} \tag{13}$$

then that coefficient is zero, and (12) reduces to

$$E(\mathbf{p}^*, \mathbf{q}) = \frac{a_{11}a_{22} - a_{12}a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}} \tag{14}$$

Equation 14 is independent of $\mathbf{q}$; that is, if player R chooses the strategy determined by (13), player C cannot change the expected
payoff by varying his or her strategy.



In a similar manner, it may be verified that if player C chooses the strategy determined by

$$q_1 = q_1^* = \frac{a_{22} - a_{12}}{a_{11} + a_{22} - a_{12} - a_{21}} \tag{15}$$

then substituting in (12) gives

$$E(\mathbf{p}, \mathbf{q}^*) = \frac{a_{11}a_{22} - a_{12}a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}} \tag{16}$$

Equations 14 and 16 show that

$$E(\mathbf{p}^*, \mathbf{q}) = E(\mathbf{p}^*, \mathbf{q}^*) = E(\mathbf{p}, \mathbf{q}^*) \tag{17}$$

for all strategies $\mathbf{p}$ and $\mathbf{q}$. Thus, the strategies determined by (13), (15), and (10) are optimal strategies for players R and C, respectively,
and so we have the following result.



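THEOREM 11.8.2

Optimal Strategies for a 2 × 2 Matrix Game

For a 2 × 2 game that is not strictly determined, optimal strategies for players R and C are

$$\mathbf{p}^* = \begin{bmatrix} \dfrac{a_{22} - a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}} & \dfrac{a_{11} - a_{12}}{a_{11} + a_{22} - a_{12} - a_{21}} \end{bmatrix}$$

and

$$\mathbf{q}^* = \begin{bmatrix} \dfrac{a_{22} - a_{12}}{a_{11} + a_{22} - a_{12} - a_{21}} \\[2ex] \dfrac{a_{11} - a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}} \end{bmatrix}$$

The value of the game is

$$v = \frac{a_{11}a_{22} - a_{12}a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}}$$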



In order to be complete, we must show that the entries in the vectors $\mathbf{p}^*$ and $\mathbf{q}^*$ are numbers strictly between 0 and 1. In Exercise
8 we ask the reader to show that this is the case as long as the game is not strictly determined.

Equation 17 is interesting in that it implies that either player may force the expected payoff to be the value of the game by choosing 
his or her optimal strategy, regardless of which strategy the other player chooses. This is not true, in general, for games in which 
either player has more than two moves. 



EXAMPLE 3 Using Theorem 11.8.2



The federal government desires to inoculate its citizens against a certain flu virus. The virus has two strains, and the proportions in 
which the two strains occur in the virus population is not known. Two vaccines have been developed. Vaccine 1 is 85% effective 
against strain 1 and 70% effective against strain 2. Vaccine 2 is 60% effective against strain 1 and 90% effective against strain 2. 
What inoculation policy should the government adopt? 



Solution 

We may consider this a two-person game in which player R (the government) desires to make the payoff (the fraction of citizens 
resistant to the virus) as large as possible, and player C (the virus) desires to make the payoff as small as possible. The payoff 
matrix is 

$$\begin{array}{c|cc} & \text{Strain 1} & \text{Strain 2} \\ \hline \text{Vaccine 1} & .85 & .70 \\ \text{Vaccine 2} & .60 & .90 \end{array}$$

This matrix has no saddle points, so Theorem 11.8.2 is applicable. Consequently,

$$p_1^* = \frac{a_{22} - a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}} = \frac{.90 - .60}{.85 + .90 - .70 - .60} = \frac{.30}{.45} = \frac{2}{3}$$

$$p_2^* = 1 - p_1^* = 1 - \frac{2}{3} = \frac{1}{3}$$

$$q_1^* = \frac{a_{22} - a_{12}}{a_{11} + a_{22} - a_{12} - a_{21}} = \frac{.90 - .70}{.85 + .90 - .70 - .60} = \frac{.20}{.45} = \frac{4}{9}$$

$$q_2^* = 1 - q_1^* = 1 - \frac{4}{9} = \frac{5}{9}$$

$$v = \frac{a_{11}a_{22} - a_{12}a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}} = \frac{(.85)(.90) - (.70)(.60)}{.85 + .90 - .70 - .60} = \frac{.345}{.45} = .7666\ldots$$

Thus, the optimal strategy for the government is to inoculate $\frac{2}{3}$ of the citizens with vaccine 1 and $\frac{1}{3}$ of the citizens with vaccine 2.
This will guarantee that about 76.7% of the citizens will be resistant to a virus attack regardless of the distribution of the two
strains.

In contrast, a virus distribution of $\frac{4}{9}$ of strain 1 and $\frac{5}{9}$ of strain 2 will result in the same 76.7% of resistant citizens, regardless of
the inoculation strategy adopted by the government (see Exercise 7).
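
The formulas of Theorem 11.8.2 are easy to evaluate mechanically. The following is a minimal sketch in plain Python using exact fractions; the function name `solve_2x2` is an illustrative choice, and the code assumes the caller has already checked that the game is not strictly determined (it does not test for saddle points itself).

```python
from fractions import Fraction as F


def solve_2x2(a11, a12, a21, a22):
    """Optimal strategies p*, q* and value v from Theorem 11.8.2.

    Assumes the 2 x 2 game is not strictly determined, so the common
    denominator a11 + a22 - a12 - a21 is nonzero.
    """
    d = a11 + a22 - a12 - a21
    p = (F(a22 - a21) / d, F(a11 - a12) / d)
    q = (F(a22 - a12) / d, F(a11 - a21) / d)
    v = F(a11 * a22 - a12 * a21) / d
    return p, q, v


# Vaccine example: rows are vaccines 1 and 2, columns are strains 1 and 2.
p, q, v = solve_2x2(F(85, 100), F(70, 100), F(60, 100), F(90, 100))
print(p)         # p* = (2/3, 1/3)
print(q)         # q* = (4/9, 5/9)
print(float(v))  # roughly 0.767
```

The exact value returned for $v$ is $\frac{23}{30}$, which matches the 76.7% figure quoted in the example.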



Exercise Set 11.8






1. 



Suppose that a game has a payoff matrix 



,4 = 



-4 6-4 1 

5-733 

-8 6-2 



(a) If players R and C use strategies 



P = 



1 I 

2 2 



and q = 



respectively, what is the expected payoff of the game? 



(b) If player C keeps his strategy fixed as in part (a), what strategy should player R choose to maximize his expected 
payoff? 

(c) If player R keeps her strategy fixed as in part (a), what strategy should player C choose to minimize the expected payoff
to player R?



2. Construct a simple example to show that optimal strategies are not necessarily unique. For example, find a payoff matrix with
several equal saddle points.

3. For the strictly determined games with the following payoff matrices, find optimal strategies for the two players, and find the
values of the games.



(a) 



(b) 



(c) 



(d) 



5 2 






7 3_ 






-3 


-2" 




2 


4 




-4 


1 




2 


-2 





-6 





-5 


5 


2 


3 


-3 


2 


-1 


-2 


-1 


5 


-4 


1 





-3 


4 


6 



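4. For the 2 × 2 games with the following payoff matrices, find optimal strategies for the two players, and find the values of the
games.

(a) $\begin{bmatrix} 6 & 3 \\ -1 & 4 \end{bmatrix}$

(b) $\begin{bmatrix} 40 & 20 \\ -10 & 30 \end{bmatrix}$

(c) $\begin{bmatrix} 3 & 7 \\ -5 & 4 \end{bmatrix}$

(d) $\begin{bmatrix} 3 & 5 \\ 5 & 2 \end{bmatrix}$

(e) $\begin{bmatrix} 7 & -3 \\ -5 & -2 \end{bmatrix}$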



5. Player R has two playing cards: a black ace and a red four. Player C also has two cards: a black two and a red three. Each player
secretly selects one of his or her cards. If both selected cards are the same color, player C pays player R the sum of the face
values in dollars. If the cards are different colors, player R pays player C the sum of the face values. What are optimal strategies
for both players, and what is the value of the game?



6. Verify Equations 6, 7, and 8.

7. Verify the statement in the last paragraph of Example 3.

8. Show that the entries of the optimal strategies $\mathbf{p}^*$ and $\mathbf{q}^*$ given in Theorem 11.8.2 are numbers strictly between zero and one.



Section 11.8

Technology Exercises

The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets.



T1. Consider a game between two players where each player can make up to $n$ different moves ($n > 1$). If the $i$th move of player
R and the $j$th move of player C are such that $i + j$ is even, then C pays R $1. If $i + j$ is odd, then R pays C $1. Assume that
both players have the same strategy; that is, $\mathbf{p}_n = [p_i]_{1 \times n}$ and $\mathbf{q}_n = [p_i]_{n \times 1}$, where $p_1 + p_2 + p_3 + \cdots + p_n = 1$. Use a
computer to show that

$$E(\mathbf{p}_2, \mathbf{q}_2) = (p_1 - p_2)^2$$
$$E(\mathbf{p}_3, \mathbf{q}_3) = (p_1 - p_2 + p_3)^2$$
$$E(\mathbf{p}_4, \mathbf{q}_4) = (p_1 - p_2 + p_3 - p_4)^2$$
$$E(\mathbf{p}_5, \mathbf{q}_5) = (p_1 - p_2 + p_3 - p_4 + p_5)^2$$

Using these results as a guide, prove in general that the expected payoff to player R is

$$E(\mathbf{p}_n, \mathbf{q}_n) = \left( \sum_{j=1}^{n} (-1)^{j+1} p_j \right)^2 \ge 0$$

which shows that in the long run, player R will not lose in this game.
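
One way to carry out the "use a computer" step of Exercise T1 is to compare the expected payoff with the conjectured square for randomly chosen strategies. The following is a minimal sketch in plain Python with exact fractions; all function names are illustrative, and a numerical check of this kind supports the conjecture but does not replace the requested proof.

```python
from fractions import Fraction as F
from random import randint


def parity_game_payoff(p):
    """E(p, q) with q = p, where a_ij = +1 if i + j is even and -1 otherwise."""
    n = len(p)
    # Indices are 0-based here; i + j has the same parity as (i+1) + (j+1).
    return sum((1 if (i + j) % 2 == 0 else -1) * p[i] * p[j]
               for i in range(n) for j in range(n))


def alternating_square(p):
    """(p_1 - p_2 + p_3 - ...)^2, the conjectured closed form."""
    return sum((-1) ** j * pj for j, pj in enumerate(p)) ** 2


def random_strategy(n):
    """A random probability vector with exact rational entries."""
    weights = [randint(1, 9) for _ in range(n)]
    total = sum(weights)
    return [F(w, total) for w in weights]


for n in range(2, 7):
    p = random_strategy(n)
    assert sum(p) == 1
    assert parity_game_payoff(p) == alternating_square(p)
```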



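T2. Consider a game between two players where each player can make up to $n$ different moves ($n > 1$). If both players make the
same move, then player C pays player R $(n − 1). However, if both players make different moves, then player R pays player
C $1. Assume that both players have the same strategy; that is, $\mathbf{p}_n = [p_i]_{1 \times n}$ and $\mathbf{q}_n = [p_i]_{n \times 1}$, where
$p_1 + p_2 + p_3 + \cdots + p_n = 1$. Use a computer to show that

$$E(\mathbf{p}_2, \mathbf{q}_2) = \tfrac{1}{2}(p_1 - p_1)^2 + \tfrac{1}{2}(p_1 - p_2)^2 + \tfrac{1}{2}(p_2 - p_1)^2 + \tfrac{1}{2}(p_2 - p_2)^2$$

$$\begin{aligned} E(\mathbf{p}_3, \mathbf{q}_3) = {} & \tfrac{1}{2}(p_1 - p_1)^2 + \tfrac{1}{2}(p_1 - p_2)^2 + \tfrac{1}{2}(p_1 - p_3)^2 \\ & + \tfrac{1}{2}(p_2 - p_1)^2 + \tfrac{1}{2}(p_2 - p_2)^2 + \tfrac{1}{2}(p_2 - p_3)^2 \\ & + \tfrac{1}{2}(p_3 - p_1)^2 + \tfrac{1}{2}(p_3 - p_2)^2 + \tfrac{1}{2}(p_3 - p_3)^2 \end{aligned}$$

$$\begin{aligned} E(\mathbf{p}_4, \mathbf{q}_4) = {} & \tfrac{1}{2}(p_1 - p_1)^2 + \tfrac{1}{2}(p_1 - p_2)^2 + \tfrac{1}{2}(p_1 - p_3)^2 + \tfrac{1}{2}(p_1 - p_4)^2 \\ & + \tfrac{1}{2}(p_2 - p_1)^2 + \tfrac{1}{2}(p_2 - p_2)^2 + \tfrac{1}{2}(p_2 - p_3)^2 + \tfrac{1}{2}(p_2 - p_4)^2 \\ & + \tfrac{1}{2}(p_3 - p_1)^2 + \tfrac{1}{2}(p_3 - p_2)^2 + \tfrac{1}{2}(p_3 - p_3)^2 + \tfrac{1}{2}(p_3 - p_4)^2 \\ & + \tfrac{1}{2}(p_4 - p_1)^2 + \tfrac{1}{2}(p_4 - p_2)^2 + \tfrac{1}{2}(p_4 - p_3)^2 + \tfrac{1}{2}(p_4 - p_4)^2 \end{aligned}$$

Using these results as a guide, prove in general that the expected payoff to player R is

$$E(\mathbf{p}_n, \mathbf{q}_n) = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (p_i - p_j)^2 \ge 0$$

which shows that in the long run, player R will not lose in this game.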

11.9

LEONTIEF ECONOMIC MODELS



In this section we discuss two linear models for economic systems. Some 
results about nonnegative matrices are applied to determine equilibrium price 
structures and outputs necessary to satisfy demand. 



Prerequisites: Linear Systems 
Matrices 



Economic Systems 

Matrix theory has been very successful in describing the interrelations among prices, outputs, and demands in economic systems. 
In this section we discuss some simple models based on the ideas of Nobel laureate Wassily Leontief. We examine two different
but related models: the closed or input-output model, and the open or production model. In each, we are given certain economic 
parameters that describe the interrelations between the "industries" in the economy under consideration. Using matrix theory, we 
then evaluate certain other parameters, such as prices or output levels, in order to satisfy a desired economic objective. We begin 
with the closed model. 

Leontief Closed (Input-Output) Model 

First we present a simple example; then we proceed to the general theory of the model. 



EXAMPLE 1 An Input-Output Model 



Three homeowners — a carpenter, an electrician, and a plumber — agree to make repairs in their three homes. They agree to work a 
total of 10 days each according to the following schedule: 



                                          Work Performed by
                                          Carpenter   Electrician   Plumber
Days of Work in Home of Carpenter             2            1           6
Days of Work in Home of Electrician           4            5           1
Days of Work in Home of Plumber               4            4           3

For tax purposes, they must report and pay each other a reasonable daily wage, even for the work each does on his or her own
home. Their normal daily wages are about $100, but they agree to adjust their respective daily wages so that each homeowner will
come out even; that is, so that the total amount paid out by each is the same as the total amount each receives. We can set

$p_1$ = daily wage of carpenter

$p_2$ = daily wage of electrician

$p_3$ = daily wage of plumber

To satisfy the "equilibrium" condition that each homeowner comes out even, we require that

total expenditures = total income

for each of the homeowners for the 10-day period. For example, the carpenter pays a total of $2p_1 + p_2 + 6p_3$ for the repairs in his
own home and receives a total income of $10p_1$ for the repairs that he performs on all three homes. Equating these two expressions
then gives the first of the following three equations:

$$\begin{aligned} 2p_1 + p_2 + 6p_3 &= 10p_1 \\ 4p_1 + 5p_2 + p_3 &= 10p_2 \\ 4p_1 + 4p_2 + 3p_3 &= 10p_3 \end{aligned}$$

The remaining two equations are the equilibrium equations for the electrician and the plumber. Dividing these equations by 10 and
rewriting them in matrix form yields

$$\begin{bmatrix} .2 & .1 & .6 \\ .4 & .5 & .1 \\ .4 & .4 & .3 \end{bmatrix} \begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix} = \begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix} \tag{1}$$

Equation 1 can be rewritten as a homogeneous system by subtracting the left side from the right side to obtain

$$\begin{bmatrix} .8 & -.1 & -.6 \\ -.4 & .5 & -.1 \\ -.4 & -.4 & .7 \end{bmatrix} \begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

The solution of this homogeneous system is found to be (verify)

$$\begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix} = s \begin{bmatrix} 31 \\ 32 \\ 36 \end{bmatrix}$$

where s is an arbitrary constant. This constant is a scale factor, which the homeowners may choose for their convenience. For 
example, they may set s = 3 so that the corresponding daily wages — $93, $96, and $108 — are about $100. 
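
The "(verify)" step in Example 1 can be done by row reduction. The following is a minimal sketch in plain Python with exact fractions: it row-reduces $I - E$ and reads off a solution with the free variable set to 1. The function name `null_vector` is an illustrative choice, and the routine assumes the solution space is one-dimensional, as it is for this matrix.

```python
from fractions import Fraction as F


def null_vector(M):
    """One nonzero solution of M x = 0, found by Gauss-Jordan elimination.

    Assumes M is square and singular with a one-dimensional null space.
    """
    n = len(M)
    rows = [list(map(F, r)) for r in M]
    pivots = []
    r = 0
    for c in range(n):
        # Find a row at or below r with a nonzero entry in column c.
        pr = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if pr is None:
            continue
        rows[r], rows[pr] = rows[pr], rows[r]
        piv = rows[r][c]
        rows[r] = [x / piv for x in rows[r]]
        for i in range(n):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    free = [c for c in range(n) if c not in pivots][0]
    x = [F(0)] * n
    x[free] = F(1)  # set the free variable to 1
    for row, c in zip(rows, pivots):
        x[c] = -row[free]
    return x


E = [[F(2, 10), F(1, 10), F(6, 10)],
     [F(4, 10), F(5, 10), F(1, 10)],
     [F(4, 10), F(4, 10), F(3, 10)]]
I_minus_E = [[(1 if i == j else 0) - E[i][j] for j in range(3)]
             for i in range(3)]

prices = [36 * x for x in null_vector(I_minus_E)]  # rescale to integers
wages = [3 * x for x in prices]                    # the choice s = 3
print(prices)
print(wages)
```

The factor 36 is chosen only to clear denominators; any positive multiple of the computed vector is an equally valid equilibrium price vector.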


This example illustrates the salient features of the Leontief input-output model of a closed economy. In the basic Equation 1, each 
column sum of the coefficient matrix is 1, corresponding to the fact that each of the homeowners' "output" of labor is completely 
distributed among these same homeowners in the proportions given by the entries in the column. Our problem is to determine 
suitable "prices" for these outputs so as to put the system in equilibrium — that is, so that each homeowner's total expenditures 
equal his or her total income. 

In the general model we have an economic system consisting of a finite number of "industries," which we number as industries 1, 
2, . . ., k. Over some fixed period of time, each industry produces an "output" of some good or service that is completely utilized in 
a predetermined manner by the k industries. An important problem is to find suitable "prices" to be charged for these k outputs so 
that for each industry, total expenditures equal total income. Such a price structure represents an equilibrium position for the 
economy. 

For the fixed time period in question, let us set 

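$p_i$ = price charged by the $i$th industry for its total output

$e_{ij}$ = fraction of the total output of the $j$th industry purchased by the $i$th industry

for $i, j = 1, 2, \ldots, k$. By definition, we have

(i) $p_i \ge 0$, $i = 1, 2, \ldots, k$

(ii) $e_{ij} \ge 0$, $i, j = 1, 2, \ldots, k$

(iii) $e_{1j} + e_{2j} + \cdots + e_{kj} = 1$, $j = 1, 2, \ldots, k$

With these quantities, we form the price vector

$$\mathbf{p} = \begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_k \end{bmatrix}$$

and the exchange matrix or input-output matrix

$$E = \begin{bmatrix} e_{11} & e_{12} & \cdots & e_{1k} \\ e_{21} & e_{22} & \cdots & e_{2k} \\ \vdots & \vdots & & \vdots \\ e_{k1} & e_{k2} & \cdots & e_{kk} \end{bmatrix}$$

Condition (iii) expresses the fact that all the column sums of the exchange matrix are 1.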

As in the example, in order that the expenditures of each industry be equal to its income, the following matrix equation must be
satisfied [see (1)]:

$$E\mathbf{p} = \mathbf{p} \tag{2}$$

or

$$(I - E)\mathbf{p} = \mathbf{0} \tag{3}$$

Equation 3 is a homogeneous linear system for the price vector $\mathbf{p}$. It will have a nontrivial solution if and only if the determinant of
its coefficient matrix $I - E$ is zero. In Exercise 7 we ask the reader to show that this is the case for any exchange matrix $E$. Thus, (3)
always has nontrivial solutions for the price vector $\mathbf{p}$.

Actually, for our economic model to make sense, we need more than just the fact that (3) has nontrivial solutions for $\mathbf{p}$. We also need
the prices $p_i$ of the $k$ outputs to be nonnegative numbers. We express this condition as $\mathbf{p} \ge 0$. (In general, if $A$ is any vector or
matrix, the notation $A \ge 0$ means that every entry of $A$ is nonnegative, and the notation $A > 0$ means that every entry of $A$ is
positive. Similarly, $A \ge B$ means $A - B \ge 0$, and $A > B$ means $A - B > 0$.) To show that (3) has a nontrivial solution for which $\mathbf{p} \ge 0$
is a bit more difficult than showing merely that some nontrivial solution exists. But it is true, and we state this fact without proof in
the following theorem.



THEOREM 11.9.1 



If $E$ is an exchange matrix, then $E\mathbf{p} = \mathbf{p}$ always has a nontrivial solution $\mathbf{p}$ whose entries are nonnegative.



Let us consider a few simple examples of this theorem. 



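EXAMPLE 2 Using Theorem 11.9.1

Let

$$E = \begin{bmatrix} \frac{1}{2} & 0 \\ \frac{1}{2} & 1 \end{bmatrix}$$

Then $(I - E)\mathbf{p} = \mathbf{0}$ is

$$\begin{bmatrix} \frac{1}{2} & 0 \\ -\frac{1}{2} & 0 \end{bmatrix} \begin{bmatrix} p_1 \\ p_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

which has the general solution

$$\mathbf{p} = s \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

where $s$ is an arbitrary constant. We then have nontrivial solutions $\mathbf{p} \ge 0$ for any $s > 0$.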



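EXAMPLE 3 Using Theorem 11.9.1

Let

\[ E = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

Then $(I - E)\mathbf{p} = \mathbf{0}$ has the general solution

\[ \mathbf{p} = s \begin{bmatrix} 1 \\ 0 \end{bmatrix} + t \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]

where s and t are independent arbitrary constants. Nontrivial solutions $\mathbf{p} \ge 0$ then result from any $s \ge 0$ and $t \ge 0$, not both zero.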

Example 2 indicates that in some situations one of the prices must be zero in order to satisfy the equilibrium condition. Example 3 
indicates that there may be several linearly independent price structures available. Neither of these situations describes a truly 
interdependent economic structure. The following theorem gives sufficient conditions for both cases to be excluded. 

THEOREM 11.9.2 



Let E be an exchange matrix such that for some positive integer m all the entries of $E^m$ are positive. Then there is exactly one linearly independent solution of $(I - E)\mathbf{p} = \mathbf{0}$, and it may be chosen so that all its entries are positive.



We will not give a proof of this theorem. The reader who has read Section 11.6 on Markov chains may observe that this theorem is essentially the same as Theorem 11.6.4. What we are calling exchange matrices in this section were called stochastic or Markov matrices in Section 11.6.



EXAMPLE 4 Using Theorem 11.9.2

The exchange matrix in Example 1 was

\[ E = \begin{bmatrix} .2 & .1 & .6 \\ .4 & .5 & .1 \\ .4 & .4 & .3 \end{bmatrix} \]

Because $E > 0$, the condition $E^m > 0$ in Theorem 11.9.2 is satisfied for m = 1. Consequently, we are guaranteed that there is exactly one linearly independent solution of $(I - E)\mathbf{p} = \mathbf{0}$, and it can be chosen so that $\mathbf{p} > 0$. In that example, we found that

\[ \mathbf{p} = \begin{bmatrix} 31 \\ 32 \\ 36 \end{bmatrix} \]

is such a solution.
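
The equilibrium condition for this example can be checked numerically. The following is a minimal sketch, assuming the entries of the Example 1 exchange matrix are .2, .1, .6 / .4, .5, .1 / .4, .4, .3 (read row by row) and using exact fractions to avoid rounding; the helper name `mat_vec` is illustrative only.

```python
from fractions import Fraction as F

# Exchange matrix of Example 1 (assumed entries), stored as exact fractions.
E = [
    [F(2, 10), F(1, 10), F(6, 10)],
    [F(4, 10), F(5, 10), F(1, 10)],
    [F(4, 10), F(4, 10), F(3, 10)],
]
p = [31, 32, 36]  # candidate price vector from Example 1


def mat_vec(A, v):
    """Return the matrix-vector product A v as a list."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


# Condition (iii): every column of an exchange matrix sums to 1.
column_sums = [sum(E[i][j] for i in range(3)) for j in range(3)]

# Equilibrium condition (2): E p = p.
Ep = mat_vec(E, p)

print(column_sums == [1, 1, 1])  # True
print(Ep == p)                   # True
```

Any positive multiple of p passes the same check, which is what "exactly one linearly independent solution" allows.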



Leontief Open (Production) Model 



In contrast with the closed model, in which the outputs of k industries are distributed only among themselves, the open model 
attempts to satisfy an outside demand for the outputs. Portions of these outputs may still be distributed among the industries 
themselves, to keep them operating, but there is to be some excess, some net production, with which to satisfy the outside demand. 
In the closed model the outputs of the industries are fixed, and our objective is to determine prices for these outputs so that the 
equilibrium condition, that expenditures equal incomes, is satisfied. In the open model it is the prices that are fixed, and our 
objective is to determine levels of the outputs of the industries needed to satisfy the outside demand. We will measure the levels of 
the outputs in terms of their economic values using the fixed prices. To be precise, over some fixed period of time, let 

$x_i$ = monetary value of the total output of the ith industry

$d_i$ = monetary value of the output of the ith industry needed to satisfy the outside demand

$c_{ij}$ = monetary value of the output of the ith industry needed by the jth industry to produce one unit of monetary value of its own output

With these quantities, we define the production vector

\[ \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \end{bmatrix} \]

the demand vector

\[ \mathbf{d} = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_k \end{bmatrix} \]

and the consumption matrix

\[ C = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1k} \\ c_{21} & c_{22} & \cdots & c_{2k} \\ \vdots & \vdots & & \vdots \\ c_{k1} & c_{k2} & \cdots & c_{kk} \end{bmatrix} \]

By their nature, we have that

\[ \mathbf{x} \ge 0, \quad \mathbf{d} \ge 0, \quad \text{and} \quad C \ge 0 \]

From the definition of $c_{ij}$ and $x_j$, it can be seen that the quantity

\[ c_{i1}x_1 + c_{i2}x_2 + \cdots + c_{ik}x_k \]

is the value of the output of the ith industry needed by all k industries to produce a total output specified by the production vector x. Because this quantity is simply the ith entry of the column vector $C\mathbf{x}$, we can say further that the ith entry of the column vector

\[ \mathbf{x} - C\mathbf{x} \]

is the value of the excess output of the ith industry available to satisfy the outside demand. The value of the outside demand for the output of the ith industry is the ith entry of the demand vector d. Consequently, we are led to the following equation

\[ \mathbf{x} - C\mathbf{x} = \mathbf{d} \]

or

\[ (I - C)\mathbf{x} = \mathbf{d} \tag{4} \]

for the demand to be exactly met, without any surpluses or shortages. Thus, given C and d, our objective is to find a production vector $\mathbf{x} \ge 0$ that satisfies Equation 4.



EXAMPLE 5 Production Vector for a Town 



A town has three main industries: a coal-mining operation, an electric power-generating plant, and a local railroad. To mine $1 of coal, the mining operation must purchase $.25 of electricity to run its equipment and $.25 of transportation for its shipping needs. To produce $1 of electricity, the generating plant requires $.65 of coal for fuel, $.05 of its own electricity to run auxiliary equipment, and $.05 of transportation. To provide $1 of transportation, the railroad requires $.55 of coal for fuel and $.10 of electricity for its auxiliary equipment. In a certain week the coal-mining operation receives orders for $50,000 of coal from outside the town, and the generating plant receives orders for $25,000 of electricity from outside. There is no outside demand for the local railroad. How much must each of the three industries produce in that week to exactly satisfy their own demand and the outside demand?



Solution

For the one-week period let

$x_1$ = value of total output of coal-mining operation
$x_2$ = value of total output of power-generating plant
$x_3$ = value of total output of local railroad

From the information supplied, the consumption matrix of the system is

\[ C = \begin{bmatrix} 0 & .65 & .55 \\ .25 & .05 & .10 \\ .25 & .05 & 0 \end{bmatrix} \]

The linear system $(I - C)\mathbf{x} = \mathbf{d}$ is then

\[ \begin{bmatrix} 1.00 & -.65 & -.55 \\ -.25 & .95 & -.10 \\ -.25 & -.05 & 1.00 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 50{,}000 \\ 25{,}000 \\ 0 \end{bmatrix} \]

The coefficient matrix on the left is invertible, and the solution is given by

\[ \mathbf{x} = (I - C)^{-1}\mathbf{d} = \frac{1}{503} \begin{bmatrix} 756 & 542 & 470 \\ 220 & 690 & 190 \\ 200 & 170 & 630 \end{bmatrix} \begin{bmatrix} 50{,}000 \\ 25{,}000 \\ 0 \end{bmatrix} = \begin{bmatrix} 102{,}087 \\ 56{,}163 \\ 28{,}330 \end{bmatrix} \]

Thus, the total output of the coal-mining operation should be $102,087, the total output of the power-generating plant should be $56,163, and the total output of the railroad should be $28,330.
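
The production vector of this example can be reproduced with a few lines of code. The sketch below solves $(I - C)\mathbf{x} = \mathbf{d}$ by Gauss-Jordan elimination in plain Python; the function name `solve` and the choice of elimination with partial pivoting are illustrative assumptions, and any linear algebra utility would serve equally well.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | b] in floating point.
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Bring the largest remaining entry of this column to the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        piv = M[col][col]
        M[col] = [v / piv for v in M[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


# Consumption matrix and outside demand from Example 5.
C = [[0.00, 0.65, 0.55],
     [0.25, 0.05, 0.10],
     [0.25, 0.05, 0.00]]
d = [50000, 25000, 0]

I_minus_C = [[(1.0 if i == j else 0.0) - C[i][j] for j in range(3)]
             for i in range(3)]
x = solve(I_minus_C, d)
print([round(v) for v in x])  # [102087, 56163, 28330]
```

All three entries of x are nonnegative, as the model requires of a production vector.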



Let us reconsider Equation 4:

\[ (I - C)\mathbf{x} = \mathbf{d} \]

If the square matrix $I - C$ is invertible, we can write

\[ \mathbf{x} = (I - C)^{-1}\mathbf{d} \tag{5} \]

In addition, if the matrix $(I - C)^{-1}$ has only nonnegative entries, then we are guaranteed that for any $\mathbf{d} \ge 0$, Equation 5 has a unique nonnegative solution for x. This is a particularly desirable situation, as it means that any outside demand can be met. The terminology used to describe this case is given in the following definition.

DEFINITION

A consumption matrix C is said to be productive if $(I - C)^{-1}$ exists and

\[ (I - C)^{-1} \ge 0 \]



We will now consider some simple criteria that guarantee that a consumption matrix is productive. The first is given in the 
following theorem. 



THEOREM 11.9.3 



Productive Consumption Matrix 

A consumption matrix C is productive if and only if there is some production vector $\mathbf{x} \ge 0$ such that $\mathbf{x} > C\mathbf{x}$.

(The proof is outlined in Exercise 9.) The condition $\mathbf{x} > C\mathbf{x}$ means that there is some production schedule possible such that each industry produces more than it consumes.

Theorem 11.9.3 has two interesting corollaries. Suppose that all the row sums of C are less than 1. If

\[ \mathbf{x} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} \]

then $C\mathbf{x}$ is a column vector whose entries are these row sums. Therefore, $\mathbf{x} > C\mathbf{x}$, and the condition of Theorem 11.9.3 is satisfied. Thus, we arrive at the following corollary:

COROLLARY 11.9.4 



A consumption matrix is productive if each of its row sums is less than 1. 



As we ask the reader to show in Exercise 8, this corollary leads to the following: 
COROLLARY 11.9.5 



A consumption matrix is productive if each of its column sums is less than 1. 



Recalling the definition of the entries of the consumption matrix C, we see that the jth column sum of C is the total value of the 
outputs of all k industries needed to produce one unit of value of output of the jth industry. The jth industry is thus said to be 
profitable if that jth column sum is less than 1. In other words, Corollary 11.9.5 says that a consumption matrix is productive if all
k industries in the economic system are profitable. 



EXAMPLE 6 Using Corollary 11.9.5

The consumption matrix in Example 5 was

\[ C = \begin{bmatrix} 0 & .65 & .55 \\ .25 & .05 & .10 \\ .25 & .05 & 0 \end{bmatrix} \]

All three column sums in this matrix are less than 1 and so all three industries are profitable. Consequently, by Corollary 11.9.5, the consumption matrix C is productive. This can also be seen in the calculations in Example 5, as $(I - C)^{-1}$ is nonnegative.
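
The two corollaries give a quick sufficient test that is easy to automate. The sketch below applies it to the consumption matrix of Example 5; the function names are illustrative assumptions. The test is only sufficient: a result of False means the corollaries are inconclusive, not that the matrix fails to be productive.

```python
def row_sums(C):
    """Sum of each row of C."""
    return [sum(row) for row in C]


def column_sums(C):
    """Sum of each column of C."""
    return [sum(col) for col in zip(*C)]


def productive_by_corollaries(C):
    """True if Corollary 11.9.4 (rows) or 11.9.5 (columns) applies to C."""
    return (all(s < 1 for s in row_sums(C))
            or all(s < 1 for s in column_sums(C)))


# Consumption matrix from Example 5.
C = [[0.00, 0.65, 0.55],
     [0.25, 0.05, 0.10],
     [0.25, 0.05, 0.00]]

rows_ok = all(s < 1 for s in row_sums(C))      # first row sums to 1.20
cols_ok = all(s < 1 for s in column_sums(C))   # column sums .50, .75, .65
print(rows_ok, cols_ok, productive_by_corollaries(C))
```

For this matrix the row-sum test fails (the first row sums to 1.20) while the column-sum test succeeds, so Corollary 11.9.5 is the one that applies.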



Exercise Set 11.9



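1. For the following exchange matrices, find nonnegative price vectors that satisfy the equilibrium condition 3.

(a) \[ \begin{bmatrix} \tfrac{1}{2} & \tfrac{1}{3} \\ \tfrac{1}{2} & \tfrac{2}{3} \end{bmatrix} \]

(b) \[ \begin{bmatrix} \tfrac{1}{2} & 0 & \tfrac{1}{2} \\ \tfrac{1}{3} & 0 & \tfrac{1}{2} \\ \tfrac{1}{6} & 1 & 0 \end{bmatrix} \]

(c) \[ \begin{bmatrix} .35 & .50 & .30 \\ .25 & .20 & .30 \\ .40 & .30 & .40 \end{bmatrix} \]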

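2. Using Theorem 11.9.3 and its corollaries, show that each of the following consumption matrices is productive.

(a) \[ \begin{bmatrix} .8 & .1 \\ .3 & .6 \end{bmatrix} \]

(b) \[ \begin{bmatrix} .70 & .30 & .25 \\ .20 & .40 & .25 \\ .05 & .15 & .25 \end{bmatrix} \]

(c) \[ \begin{bmatrix} .7 & .3 & .2 \\ .1 & .4 & .3 \\ .2 & .4 & .1 \end{bmatrix} \]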

3. Using Theorem 11.9.2, show that there is only one linearly independent price vector for the closed economic system with exchange matrix

\[ E = \begin{bmatrix} 0 & .2 & .5 \\ 1 & .2 & .5 \\ 0 & .6 & 0 \end{bmatrix} \]




Three neighbors have backyard vegetable gardens. Neighbor A grows tomatoes, neighbor B grows corn, and neighbor C grows 
• lettuce. They agree to divide their crops among themselves as follows: A gets -^ of the tomatoes, -^ of the corn, and j- of the 

lettuce. B gets ^ of the tomatoes, ^ of the corn, and -j of the lettuce. C gets jr of the tomatoes, ^ of the corn, and -^ of the 

lettuce. What prices should the neighbors assign to their respective crops if the equilibrium condition of a closed economy is to 
be satisfied, and if the lowest-priced crop is to have a price of $100? 



5. Three engineers, a civil engineer (CE), an electrical engineer (EE), and a mechanical engineer (ME), each have a consulting firm. The consulting they do is of a multidisciplinary nature, so they buy a portion of each other's services. For each $1 of consulting the CE does, she buys $.10 of the EE's services and $.30 of the ME's services. For each $1 of consulting the EE does, she buys $.20 of the CE's services and $.40 of the ME's services. And for each $1 of consulting the ME does, she buys $.30 of the CE's services and $.40 of the EE's services. In a certain week the CE receives outside consulting orders of $500, the EE receives outside consulting orders of $700, and the ME receives outside consulting orders of $600. What dollar amount of consulting does each engineer perform in that week?



6.
(a) Suppose that the demand $d_i$ for the output of the ith industry increases by one unit. Explain why the ith column of the matrix $(I - C)^{-1}$ is the increase that must be made to the production vector x to satisfy this additional demand.

(b) Referring to Example 5, use the result in part (a) to determine the increase in the value of the output of the coal-mining operation needed to satisfy a demand of one additional unit in the value of the output of the power-generating plant.

7. Using the fact that the column sums of an exchange matrix E are all 1, show that the column sums of $I - E$ are zero. From this, show that $I - E$ has zero determinant, and so $(I - E)\mathbf{p} = \mathbf{0}$ has nontrivial solutions for p.

8. Show that Corollary 11.9.5 follows from Corollary 11.9.4.

Hint Use the fact that $(A^T)^{-1} = (A^{-1})^T$ for any invertible matrix A.

9. (For Readers Who Have Studied Calculus) Prove Theorem 11.9.3 as follows:

(a) Prove the "only if" part of the theorem; that is, show that if C is a productive consumption matrix, then there is a vector $\mathbf{x} \ge 0$ such that $\mathbf{x} > C\mathbf{x}$.

(b) Prove the "if" part of the theorem as follows:

Step 1. Show that if there is a vector $\mathbf{x}^* \ge 0$ such that $C\mathbf{x}^* < \mathbf{x}^*$, then $\mathbf{x}^* > 0$.

Step 2. Show that there is a number $\lambda$ such that $0 < \lambda < 1$ and $C\mathbf{x}^* < \lambda\mathbf{x}^*$.

Step 3. Show that $C^n\mathbf{x}^* < \lambda^n\mathbf{x}^*$ for n = 1, 2, ....

Step 4. Show that $C^n \to 0$ as $n \to \infty$.

Step 5. By multiplying out, show that

\[ (I - C)(I + C + C^2 + \cdots + C^{n-1}) = I - C^n \]

for n = 1, 2, ....

Step 6. By letting $n \to \infty$ in Step 5, show that the matrix infinite sum

\[ S = I + C + C^2 + \cdots \]

exists and that $(I - C)S = I$.

Step 7. Show that $S \ge 0$ and that $S = (I - C)^{-1}$.

Step 8. Show that C is a productive consumption matrix.



Section 11.9
Technology Exercises



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple, Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets.



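T1. Consider a sequence of exchange matrices $\{E_2, E_3, E_4, E_5, \ldots, E_n\}$, where

\[ E_2 = \begin{bmatrix} 0 & \tfrac{1}{2} \\ 1 & \tfrac{1}{2} \end{bmatrix}, \quad E_3 = \begin{bmatrix} 0 & \tfrac{1}{2} & \tfrac{1}{3} \\ 1 & 0 & \tfrac{1}{3} \\ 0 & \tfrac{1}{2} & \tfrac{1}{3} \end{bmatrix}, \quad E_4 = \begin{bmatrix} 0 & \tfrac{1}{2} & \tfrac{1}{3} & \tfrac{1}{4} \\ 1 & 0 & \tfrac{1}{3} & \tfrac{1}{4} \\ 0 & \tfrac{1}{2} & 0 & \tfrac{1}{4} \\ 0 & 0 & \tfrac{1}{3} & \tfrac{1}{4} \end{bmatrix}, \quad E_5 = \begin{bmatrix} 0 & \tfrac{1}{2} & \tfrac{1}{3} & \tfrac{1}{4} & \tfrac{1}{5} \\ 1 & 0 & \tfrac{1}{3} & \tfrac{1}{4} & \tfrac{1}{5} \\ 0 & \tfrac{1}{2} & 0 & \tfrac{1}{4} & \tfrac{1}{5} \\ 0 & 0 & \tfrac{1}{3} & 0 & \tfrac{1}{5} \\ 0 & 0 & 0 & \tfrac{1}{4} & \tfrac{1}{5} \end{bmatrix} \]

and so on. Use a computer to show that $E_2^2 > 0$, $E_3^3 > 0$, $E_4^4 > 0$, $E_5^5 > 0$, and make the conjecture that although $E_n^n > 0$ is true, $E_n^k > 0$ is not true for k = 1, 2, 3, ..., n - 1. Next, use a computer to determine the vectors $\mathbf{p}_n$ such that $E_n\mathbf{p}_n = \mathbf{p}_n$ (for n = 2, 3, 4, 5, 6), and then see if you can discover a pattern that would allow you to compute $\mathbf{p}_{n+1}$ easily from $\mathbf{p}_n$. Test your discovery by first constructing $\mathbf{p}_8$ from

\[ \mathbf{p}_7 = \begin{bmatrix} 2520 \\ 3360 \\ 1890 \\ 672 \\ 175 \\ 36 \\ 7 \end{bmatrix} \]

and then checking to see whether $E_8\mathbf{p}_8 = \mathbf{p}_8$.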
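
As one possible starting point for this exercise, the sketch below builds $E_n$ with exact fractions and finds the first power whose entries are all positive. It assumes the pattern suggested by $E_2$ through $E_5$: column j (for j < n) holds 1/j in rows 1, ..., j - 1 and in row j + 1, and the last column holds 1/n in every row. That pattern, and all function names, are assumptions made for illustration; the pattern relating $\mathbf{p}_{n+1}$ to $\mathbf{p}_n$ is left to the reader.

```python
from fractions import Fraction


def exchange_matrix(n):
    """Build E_n under the pattern assumed from E_2, ..., E_5."""
    E = [[Fraction(0)] * n for _ in range(n)]
    for j in range(1, n + 1):        # 1-based column index
        for i in range(1, n + 1):    # 1-based row index
            if j == n or i < j or i == j + 1:
                E[i - 1][j - 1] = Fraction(1, j)
    return E


def mat_mul(A, B):
    """Matrix product of two square matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def first_positive_power(E):
    """Smallest m <= 2n with every entry of E^m positive, else None."""
    power = E
    for m in range(1, 2 * len(E) + 1):
        if all(entry > 0 for row in power for entry in row):
            return m
        power = mat_mul(power, E)
    return None


first_powers = {n: first_positive_power(exchange_matrix(n))
                for n in range(2, 6)}
print(first_powers)  # {2: 2, 3: 3, 4: 4, 5: 5}

# Check that the given p_7 satisfies E_7 p_7 = p_7 under the assumed pattern.
p7 = [2520, 3360, 1890, 672, 175, 36, 7]
E7 = exchange_matrix(7)
E7_p7 = [sum(e * p for e, p in zip(row, p7)) for row in E7]
print(E7_p7 == p7)  # True
```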



T2. Consider an open production model having n industries with n > 1. In order to produce $1 of its own output, the jth industry must spend $(1/n) for the output of the ith industry (for all $i \ne j$), but the jth industry (for all j = 1, 2, 3, ..., n) spends nothing for its own output. Construct the consumption matrix $C_n$, show that it is productive, and determine an expression for $(I_n - C_n)^{-1}$. In determining an expression for $(I_n - C_n)^{-1}$, use a computer to study the cases when n = 2, 3, 4, and 5; then make a conjecture and prove your conjecture to be true.

Hint If $F_n = [1]_{n \times n}$ (that is, the $n \times n$ matrix with every entry equal to 1), first show that

\[ F_n^2 = nF_n \]

and then express your value of $(I_n - C_n)^{-1}$ in terms of n, $I_n$, and $F_n$.

11.10

FOREST MANAGEMENT 



In this section we discuss a matrix model for the management of a forest where 
trees are grouped into classes according to height. The optimal sustainable 
yield of a periodic harvest is calculated when the trees of different height
classes can have different economic values. 



Prerequisites: Matrix Operations 



Optimal Sustainable Yield 

Our objective is to introduce a simplified model for the sustainable harvesting of a forest whose trees are classified by height. The 
height of a tree is assumed to determine its economic value when it is cut down and sold. Initially, there is a distribution of trees of 
various heights. The forest is then allowed to grow for a certain period of time, after which some of the trees of various heights are 
harvested. The trees left unharvested are to be of the same height configuration as the original forest, so that the harvest is 
sustainable. As we will see, there are many such sustainable harvesting procedures. We want to find one for which the total 
economic value of all the trees removed is as large as possible. This determines the optimal sustainable yield of the forest and is 
the largest yield that can be attained continually without depleting the forest. 

The Model 

Suppose that a harvester has a forest of Douglas fir trees that are to be sold as Christmas trees year after year. Every December the 
harvester cuts down some of the trees to be sold. For each tree cut down, a seedling is planted in its place. In this way the total 
number of trees in the forest is always the same. (In this simplified model, we will not take into account trees that die between 
harvests. We assume that every seedling planted survives and grows until it is harvested.) 

In the marketplace, trees of different heights have different economic values. Suppose that there are n different price classes corresponding to certain height intervals, as shown in Table 1 and Figure 11.10.1. The first class consists of seedlings with heights in the interval $[0, h_1)$, and these seedlings are of no economic value. The nth class consists of trees with heights greater than or equal to $h_{n-1}$.



Table 1

Class            Value (dollars)    Height Interval
1 (seedlings)    None               [0, h_1)
2                p_2                [h_1, h_2)
3                p_3                [h_2, h_3)
...              ...                ...
n - 1            p_{n-1}            [h_{n-2}, h_{n-1})
n                p_n                [h_{n-1}, ∞)

Figure 11.10.1 [graph; axis label: Value of Tree]

Let $x_i$ (i = 1, 2, ..., n) be the number of trees within the ith class that remain after each harvest. We form a column vector with the numbers and call it the nonharvest vector:

\[ \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \]

For a sustainable harvesting policy, the forest is to be returned after each harvest to the fixed configuration given by the nonharvest vector x. Part of our problem is to find those nonharvest vectors x for which sustainable harvesting is possible.

Because the total number of trees in the forest is fixed, we can set

\[ x_1 + x_2 + \cdots + x_n = s \tag{1} \]

where s is predetermined by the amount of land available and the amount of space each tree requires. Referring to Figure 11.10.2, we have the following situation. The forest configuration is given by the vector x after each harvest. Between harvests the trees grow and produce a new forest configuration before each harvest. A certain number of trees are removed from each class at the harvest. Finally, a seedling is planted in place of each tree removed, to return the forest again to the configuration x.








Figure 11.10.2 [diagram of the harvest cycle; panel labels: Forest before growth (nonharvest vector x); Forest after growth; Trees removed; Trees not removed; Forest after harvest (nonharvest vector x)]



Consider first the growth of the forest between harvests. During this period a tree in the ith class may grow and move up to a higher height class. Or its growth may be retarded for some reason, and it will remain in the same class. We consequently define the following growth parameters $g_i$ for i = 1, 2, ..., n - 1:

$g_i$ = the fraction of trees in the ith class that grow into the (i + 1)-st class during a growth period

For simplicity we assume that a tree can move at most one height class upward in one growth period. With this assumption, we have

$1 - g_i$ = the fraction of trees in the ith class that remain in the ith class during a growth period

With these n - 1 growth parameters, we form the following $n \times n$ growth matrix:

\[ G = \begin{bmatrix} 1 - g_1 & 0 & 0 & \cdots & 0 & 0 \\ g_1 & 1 - g_2 & 0 & \cdots & 0 & 0 \\ 0 & g_2 & 1 - g_3 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 - g_{n-1} & 0 \\ 0 & 0 & 0 & \cdots & g_{n-1} & 1 \end{bmatrix} \tag{2} \]

Because the entries of the vector x are the numbers of trees in the n classes before the growth period, the reader can verify that the entries of the vector

\[ G\mathbf{x} = \begin{bmatrix} (1 - g_1)x_1 \\ g_1x_1 + (1 - g_2)x_2 \\ g_2x_2 + (1 - g_3)x_3 \\ \vdots \\ g_{n-2}x_{n-2} + (1 - g_{n-1})x_{n-1} \\ g_{n-1}x_{n-1} + x_n \end{bmatrix} \]

are the numbers of trees in the n classes after the growth period.
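
The growth step can be tried on a small case. The following sketch builds the growth matrix from a list of growth parameters and applies it to a forest configuration; the three-class parameters $g_1 = 1/2$, $g_2 = 1/4$ and the configuration (40, 20, 10) are hypothetical values chosen only for illustration, as are the function names.

```python
from fractions import Fraction


def growth_matrix(g):
    """Return the n x n growth matrix G for growth parameters g_1..g_{n-1}."""
    n = len(g) + 1
    G = [[Fraction(0)] * n for _ in range(n)]
    for i, g_i in enumerate(g):
        G[i][i] = 1 - g_i      # fraction staying in class i+1 (1-based)
        G[i + 1][i] = g_i      # fraction moving up one class
    G[n - 1][n - 1] = Fraction(1)  # trees in the tallest class stay there
    return G


def mat_vec(A, v):
    """Return the matrix-vector product A v as a list."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


# Hypothetical three-class forest.
g = [Fraction(1, 2), Fraction(1, 4)]
x = [40, 20, 10]

G = growth_matrix(g)
Gx = mat_vec(G, x)
print(Gx == [20, 35, 15])   # True
print(sum(Gx) == sum(x))    # True: growth moves trees but does not add any
```

Each column of G sums to 1, which is why the total number of trees is unchanged by the growth step.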

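Suppose that during the harvest we remove $y_i$ (i = 1, 2, ..., n) trees from the ith class. We will call the column vector

\[ \mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \tag{3} \]

the harvest vector. Thus, a total of

\[ y_1 + y_2 + \cdots + y_n \]

trees are removed at each harvest. This is also the total number of trees added to the first class (the new seedlings) after each harvest. If we define the following $n \times n$ replacement matrix

\[ R = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} \tag{4} \]

then the column vector

\[ R\mathbf{y} = \begin{bmatrix} y_1 + y_2 + \cdots + y_n \\ 0 \\ \vdots \\ 0 \end{bmatrix} \tag{5} \]

specifies the configuration of trees planted after each harvest.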

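At this point we are ready to write the following equation, which characterizes a sustainable harvesting policy:

[configuration at end of growth period] - [harvest] + [new seedling replacement] = [configuration at beginning of growth period]

or, mathematically,

\[ G\mathbf{x} - \mathbf{y} + R\mathbf{y} = \mathbf{x} \]

This equation can be rewritten as

\[ (I - R)\mathbf{y} = (G - I)\mathbf{x} \tag{6} \]

or, more comprehensively,

\[ \begin{bmatrix} 0 & -1 & -1 & \cdots & -1 & -1 \\ 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_{n-1} \\ y_n \end{bmatrix} = \begin{bmatrix} -g_1 & 0 & 0 & \cdots & 0 & 0 \\ g_1 & -g_2 & 0 & \cdots & 0 & 0 \\ 0 & g_2 & -g_3 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & -g_{n-1} & 0 \\ 0 & 0 & 0 & \cdots & g_{n-1} & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_{n-1} \\ x_n \end{bmatrix} \]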

We will refer to Equation 6 as the sustainable harvesting condition. Any vectors x and y with nonnegative entries, and such that $x_1 + x_2 + \cdots + x_n = s$, which satisfy this matrix equation determine a sustainable harvesting policy for the forest. Note that if $y_1 > 0$, then the harvester is removing seedlings of no economic value and replacing them with new seedlings. Because there is no point in doing this, we assume that

\[ y_1 = 0 \tag{7} \]

With this assumption, it can be verified that 6 is the matrix form of the following set of equations:

\[ \begin{aligned} y_2 + y_3 + \cdots + y_n &= g_1x_1 \\ y_2 &= g_1x_1 - g_2x_2 \\ y_3 &= g_2x_2 - g_3x_3 \\ &\;\;\vdots \\ y_{n-1} &= g_{n-2}x_{n-2} - g_{n-1}x_{n-1} \\ y_n &= g_{n-1}x_{n-1} \end{aligned} \tag{8} \]

Note that the first equation in 8 is the sum of the remaining n - 1 equations. Because we must have $y_i \ge 0$ for i = 2, 3, ..., n, Equations 8 require that

\[ g_1x_1 \ge g_2x_2 \ge \cdots \ge g_{n-1}x_{n-1} \ge 0 \tag{9} \]



Conversely, if x is a column vector with nonnegative entries that satisfy Equation 9, then 7 and 8 define a column vector y with 
nonnegative entries. Furthermore, x and y then satisfy the sustainable harvesting condition 6. In other words, a necessary and 
sufficient condition for a nonnegative column vector x to determine a forest configuration that is capable of sustainable harvesting 
is that its entries satisfy 9. 
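
The converse statement can be illustrated with a short computation: given growth parameters and a nonharvest vector satisfying condition 9, build the harvest vector from 7 and 8 and confirm that growth, harvest, and replanting return the forest to x. This is a minimal sketch; the three-class parameters and the configuration (40, 40, 20) are hypothetical, and the function names are illustrative only.

```python
from fractions import Fraction


def after_growth(g, x):
    """Entries of Gx, computed directly from the growth parameters."""
    n = len(x)
    grown = [(1 - g[0]) * x[0]]
    for i in range(1, n - 1):
        grown.append(g[i - 1] * x[i - 1] + (1 - g[i]) * x[i])
    grown.append(g[n - 2] * x[n - 2] + x[n - 1])
    return grown


def harvest_vector(g, x):
    """Harvest vector y defined by Equations 7 and 8; checks condition 9."""
    n = len(x)
    flows = [g[i] * x[i] for i in range(n - 1)]   # g_1 x_1, ..., g_{n-1} x_{n-1}
    if any(a < b for a, b in zip(flows, flows[1:])) or flows[-1] < 0:
        raise ValueError("condition 9 is violated")
    y = [0] * n                                   # y_1 = 0 (Equation 7)
    for i in range(1, n - 1):
        y[i] = flows[i - 1] - flows[i]            # y_{i+1} = g_i x_i - g_{i+1} x_{i+1}
    y[n - 1] = flows[n - 2]                       # y_n = g_{n-1} x_{n-1}
    return y


# Hypothetical three-class forest.
g = [Fraction(1, 2), Fraction(1, 4)]
x = [40, 40, 20]

y = harvest_vector(g, x)
grown = after_growth(g, x)
# Gx - y + Ry: every harvested tree is replaced by a seedling in class 1.
restored = [grown[i] - y[i] + (sum(y) if i == 0 else 0) for i in range(len(x))]
print(y == [0, 10, 10])   # True
print(restored == x)      # True
```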

Optimal Sustainable Yield 

Because we remove $y_i$ trees from the ith class (i = 2, 3, ..., n) and each tree in the ith class has an economic value of $p_i$, the total yield of the harvest, Yld, is given by

\[ \mathrm{Yld} = p_2y_2 + p_3y_3 + \cdots + p_ny_n \tag{10} \]

Using 8, we may substitute for the $y_i$'s in 10 to obtain

\[ \mathrm{Yld} = p_2g_1x_1 + (p_3 - p_2)g_2x_2 + \cdots + (p_n - p_{n-1})g_{n-1}x_{n-1} \tag{11} \]
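
The substitution that leads from 10 to 11 can be written out as follows. This is a sketch of the regrouping only, using Equations 8 and 10 and no additional assumptions: each product $g_ix_i$ appears once with coefficient $p_{i+1}$ and once with coefficient $-p_i$.

```latex
\begin{align*}
\mathrm{Yld}
  &= p_2 y_2 + p_3 y_3 + \cdots + p_{n-1} y_{n-1} + p_n y_n \\
  &= p_2 (g_1 x_1 - g_2 x_2) + p_3 (g_2 x_2 - g_3 x_3) + \cdots
     + p_{n-1} (g_{n-2} x_{n-2} - g_{n-1} x_{n-1}) + p_n g_{n-1} x_{n-1} \\
  &= p_2 g_1 x_1 + (p_3 - p_2) g_2 x_2 + \cdots + (p_n - p_{n-1}) g_{n-1} x_{n-1}
\end{align*}
```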



Combining 11, 1, and 9, we can now state the problem of maximizing the yield of the forest over all possible sustainable harvesting policies as follows:

Problem Find nonnegative numbers $x_1, x_2, \ldots, x_n$ that maximize

\[ \mathrm{Yld} = p_2g_1x_1 + (p_3 - p_2)g_2x_2 + \cdots + (p_n - p_{n-1})g_{n-1}x_{n-1} \]

subject to

\[ x_1 + x_2 + \cdots + x_n = s \]

and

\[ g_1x_1 \ge g_2x_2 \ge \cdots \ge g_{n-1}x_{n-1} \ge 0 \]

As formulated above, this problem belongs to the field of linear programming. However, we shall illustrate the following result 
below, without linear programming theory, by actually exhibiting a sustainable harvesting policy. 

THEOREM 11.10.1 



Optimal Sustainable Yield 

The optimal sustainable yield is achieved by harvesting all the trees from one particular height class and none of the trees from 
any other height class. 



Let us first set

$\mathrm{Yld}_k$ = yield obtained by harvesting all of the kth class and none of the other classes

The largest value of $\mathrm{Yld}_k$ for k = 2, 3, ..., n will then be the optimal sustainable yield, and the corresponding value of k will be the class that should be completely harvested to attain the optimal sustainable yield. Because no class but the kth is harvested, we have

\[ y_2 = y_3 = \cdots = y_{k-1} = y_{k+1} = \cdots = y_n = 0 \tag{12} \]

In addition, because all of the kth class is harvested, no trees are left unharvested in the kth class, and no trees are ever present in the height classes above the kth class. Thus,

\[ x_k = x_{k+1} = \cdots = x_n = 0 \tag{13} \]

Substituting 12 and 13 into the sustainable harvesting condition 8 gives

\[ \begin{aligned} y_k &= g_1x_1 \\ 0 &= g_1x_1 - g_2x_2 \\ 0 &= g_2x_2 - g_3x_3 \\ &\;\;\vdots \\ 0 &= g_{k-2}x_{k-2} - g_{k-1}x_{k-1} \\ y_k &= g_{k-1}x_{k-1} \end{aligned} \tag{14} \]

Equations 14 can also be written as

\[ y_k = g_1x_1 = g_2x_2 = \cdots = g_{k-1}x_{k-1} \tag{15} \]

from which it follows that

\[ \begin{aligned} x_2 &= g_1x_1/g_2 \\ x_3 &= g_1x_1/g_3 \\ &\;\;\vdots \\ x_{k-1} &= g_1x_1/g_{k-1} \end{aligned} \tag{16} \]

If we substitute Equations 13 and 16 into

\[ x_1 + x_2 + \cdots + x_n = s \]

[which is Equation 1], we can solve for $x_1$ and obtain

\[ x_1 = \frac{s}{1 + \dfrac{g_1}{g_2} + \dfrac{g_1}{g_3} + \cdots + \dfrac{g_1}{g_{k-1}}} \tag{17} \]

For the yield $\mathrm{Yld}_k$, we combine 10, 12, 15, and 17 to obtain

\[ \begin{aligned} \mathrm{Yld}_k &= p_2y_2 + p_3y_3 + \cdots + p_ny_n \\ &= p_ky_k \\ &= p_kg_1x_1 \\ &= \frac{p_ks}{\dfrac{1}{g_1} + \dfrac{1}{g_2} + \cdots + \dfrac{1}{g_{k-1}}} \end{aligned} \tag{18} \]

Equation 18 determines $\mathrm{Yld}_k$ in terms of the known growth and economic parameters for any k = 2, 3, ..., n. Thus, the optimal sustainable yield is found as follows.



THEOREM 11.10.2

Finding the Optimal Sustainable Yield

The optimal sustainable yield is the largest value of

\[ \frac{p_ks}{\dfrac{1}{g_1} + \dfrac{1}{g_2} + \cdots + \dfrac{1}{g_{k-1}}} \]

for k = 2, 3, ..., n. The corresponding value of k is the number of the class that is completely harvested.



In Exercise 4 we ask the reader to show that the nonharvest vector x for the optimal sustainable yield is

\[ \mathbf{x} = \frac{s}{\dfrac{1}{g_1} + \dfrac{1}{g_2} + \cdots + \dfrac{1}{g_{k-1}}} \begin{bmatrix} 1/g_1 \\ 1/g_2 \\ \vdots \\ 1/g_{k-1} \\ 0 \\ \vdots \\ 0 \end{bmatrix} \tag{19} \]

Theorem 11.10.2 implies that it is not necessarily the highest-priced class of trees that should be totally cropped. The growth parameters $g_i$ must also be taken into account to determine the optimal sustainable yield.



EXAMPLE 1 Using Theorem 11.10.2

For a Scots pine forest in Scotland with a growth period of six years, the following growth matrix was found (see M. B. Usher, "A Matrix Approach to the Management of Renewable Resources, with Special Reference to Selection Forests," Journal of Applied Ecology, vol. 3, 1966, pp. 355-367):

\[ G = \begin{bmatrix} .72 & 0 & 0 & 0 & 0 & 0 \\ .28 & .69 & 0 & 0 & 0 & 0 \\ 0 & .31 & .75 & 0 & 0 & 0 \\ 0 & 0 & .25 & .77 & 0 & 0 \\ 0 & 0 & 0 & .23 & .63 & 0 \\ 0 & 0 & 0 & 0 & .37 & 1.00 \end{bmatrix} \]

Suppose that the prices of trees in the five tallest height classes are

\[ p_2 = \$50, \quad p_3 = \$100, \quad p_4 = \$150, \quad p_5 = \$200, \quad p_6 = \$250 \]

Which class should be completely harvested to obtain the optimal sustainable yield, and what is that yield?

Solution

From the matrix G we have that

\[ g_1 = .28, \quad g_2 = .31, \quad g_3 = .25, \quad g_4 = .23, \quad g_5 = .37 \]

Equation 18 then gives

\[ \begin{aligned} \mathrm{Yld}_2 &= 50s/(.28^{-1}) = 14.0s \\ \mathrm{Yld}_3 &= 100s/(.28^{-1} + .31^{-1}) = 14.7s \\ \mathrm{Yld}_4 &= 150s/(.28^{-1} + .31^{-1} + .25^{-1}) = 13.9s \\ \mathrm{Yld}_5 &= 200s/(.28^{-1} + .31^{-1} + .25^{-1} + .23^{-1}) = 13.2s \\ \mathrm{Yld}_6 &= 250s/(.28^{-1} + .31^{-1} + .25^{-1} + .23^{-1} + .37^{-1}) = 14.0s \end{aligned} \]

We see that $\mathrm{Yld}_3$ is the largest of these five quantities, so from Theorem 11.10.2 the third class should be completely harvested every six years to maximize the sustainable yield. The corresponding optimal sustainable yield is $14.7s, where s is the total number of trees in the forest.
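
The five yields in this example follow directly from Equation 18 and can be recomputed as below. The sketch takes s = 1, so each result is the coefficient of s; the function name `sustainable_yields` and the dictionary layout for the prices are illustrative choices, not part of the original model.

```python
def sustainable_yields(g, prices, s=1.0):
    """Yld_k from Equation 18 for k = 2, ..., n, returned as {k: yield}."""
    yields = {}
    reciprocal_sum = 0.0
    for k in range(2, len(g) + 2):
        reciprocal_sum += 1.0 / g[k - 2]   # adds 1/g_{k-1}
        yields[k] = prices[k] * s / reciprocal_sum
    return yields


# Growth parameters and prices from Example 1 (Scots pine forest).
g = [0.28, 0.31, 0.25, 0.23, 0.37]
prices = {2: 50, 3: 100, 4: 150, 5: 200, 6: 250}

yields = sustainable_yields(g, prices)
best_k = max(yields, key=yields.get)

for k in sorted(yields):
    print(k, round(yields[k], 1))
print("harvest class", best_k)   # harvest class 3
```

Passing the actual number of trees as `s` returns the yields in dollars rather than per tree.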



Exercise Set 11.10

1. A certain forest is divided into three height classes and has a growth matrix between harvests given by

\[ G = \begin{bmatrix} \tfrac{1}{2} & 0 & 0 \\ \tfrac{1}{2} & \tfrac{1}{3} & 0 \\ 0 & \tfrac{2}{3} & 1 \end{bmatrix} \]

If the price of trees in the second class is $30 and the price of trees in the third class is $50, which class should be completely harvested to attain the optimal sustainable yield? What is the optimal yield if there are 1000 trees in the forest?

2. In Example 1, to what level must the price of trees in the fifth class rise so that the fifth class is the one to harvest completely in order to attain the optimal sustainable yield?

3. In Example 1, what must the ratio of the prices $p_2 : p_3 : p_4 : p_5 : p_6$ be in order that the yields $\mathrm{Yld}_k$, k = 2, 3, 4, 5, 6, all be the same? (In this case, any sustainable harvesting policy will produce the same optimal sustainable yield.)

4. Derive Equation 19 for the nonharvest vector x corresponding to the optimal sustainable harvesting policy described in Theorem 11.10.2.

5. For the optimal sustainable harvesting policy described in Theorem 11.10.2, how many trees are removed from the forest during each harvest?

6. If all the growth parameters $g_1, g_2, \ldots, g_{n-1}$ in the growth matrix G are equal, what should the ratio of the prices $p_2 : p_3 : \cdots : p_n$ be in order that any sustainable harvesting policy be an optimal sustainable harvesting policy? (See Exercise 3.)



Section 11.10
Technology Exercises



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple, Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets.

T1. A particular forest has growth parameters given by

\[ g_i = \frac{1}{i} \]

for i = 1, 2, 3, ..., n - 1, where n (the total number of height classes) can be chosen as large as needed. Suppose that the value of a tree in the kth height interval is given by

\[ p_k = a(k - 1)^{\rho} \]

where a is a constant (in dollars) and $\rho$ is a parameter satisfying $1 \le \rho \le 2$.

(a) Show that the yield $\mathrm{Yld}_k$ is given by

\[ \mathrm{Yld}_k = \frac{2a(k - 1)^{\rho - 1}s}{k} \]

(b) For

\[ \rho = 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9 \]

use a computer to determine the class number that should be completely harvested, and determine the optimal sustainable yield in each case. Make sure that you allow k to take on only integer values in your calculations.

(c) Repeat the calculations in part (b) using

\[ \rho = 1.91, 1.92, 1.93, 1.94, 1.95, 1.96, 1.97, 1.98, 1.99 \]

(d) Show that if $\rho = 2$, then the optimal sustainable yield can never be larger than 2as.

(e) Compare the values of k determined in parts (b) and (c) to $1/(2 - \rho)$, and use some calculus to explain why

\[ k \approx \frac{1}{2 - \rho} \]



T2. A particular forest has growth parameters given by

\[ g_i = \frac{1}{2^i} \]

for i = 1, 2, 3, ..., n - 1, where n (the total number of height classes) can be chosen as large as needed. Suppose that the value of a tree in the kth height interval is given by

\[ p_k = a(k - 1)^{\rho} \]

where a is a constant (in dollars) and $\rho$ is a parameter satisfying $1 \le \rho$.

(a) Show that the yield $\mathrm{Yld}_k$ is given by

\[ \mathrm{Yld}_k = \frac{a(k - 1)^{\rho}s}{2^k - 2} \]

(b) For

\[ \rho = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 \]

use a computer to determine the class number that should be completely harvested in order to obtain an optimal yield, and determine the optimal sustainable yield in each case. Make sure that you allow k to take on only integer values in your calculations.

(c) Compare the values of k determined in part (b) to $1 + \rho/\ln(2)$ and use some calculus to explain why

\[ k \approx 1 + \frac{\rho}{\ln(2)} \]

11.11

COMPUTER GRAPHICS 



In this section we assume that a view of a three-dimensional object is displayed 
on a video screen and show how matrix algebra can be used to obtain new 
views of the object by rotation, translation, and scaling. 



Prerequisites: Matrix Algebra 

Analytic Geometry 



Visualization of a Three-Dimensional Object 

Suppose that we want to visualize a three-dimensional object by displaying various views of it on a video screen. The object we 
have in mind to display is to be determined by a finite number of straight line segments. As an example, consider the truncated 
right pyramid with hexagonal base illustrated in Figure 11.11.1. We first introduce an xyz-coordinate system in which to embed the object. As in Figure 11.11.1, we orient the coordinate system so that its origin is at the center of the video screen and the xy-plane coincides with the plane of the screen. Consequently, an observer will see only the projection of the view of the three-dimensional object onto the two-dimensional xy-plane.




Figure 11.11.1 



In the xyz-coordinate system, the endpoints $P_1, P_2, \ldots, P_n$ of the straight line segments that determine the view of the object will
have certain coordinates, say,

\[ (x_1, y_1, z_1),\ (x_2, y_2, z_2),\ \ldots,\ (x_n, y_n, z_n) \]

These coordinates, together with a specification of which pairs are to be connected by straight line segments, are to be stored in the 
memory of the video display system. For example, assume that the 12 vertices of the truncated pyramid in Figure 11.11.1 have the 
following coordinates (the screen is 4 units wide by 3 units high): 

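$P_1$: (1.000, -.800, .000),      $P_2$: (.500, -.800, -.866),
$P_3$: (-.500, -.800, -.866),     $P_4$: (-1.000, -.800, .000),
$P_5$: (-.500, -.800, .866),      $P_6$: (.500, -.800, .866),
$P_7$: (.840, -.400, .000),       $P_8$: (.315, .125, -.546),
$P_9$: (-.210, .650, -.364),      $P_{10}$: (-.360, .800, .000),
$P_{11}$: (-.210, .650, .364),    $P_{12}$: (.315, .125, .546)

These 12 vertices are connected pairwise by 18 straight line segments as follows, where $P_i \leftrightarrow P_j$ denotes that point $P_i$ is connected
to point $P_j$:

\[
\begin{array}{llllll}
P_1 \leftrightarrow P_2, & P_2 \leftrightarrow P_3, & P_3 \leftrightarrow P_4, & P_4 \leftrightarrow P_5, & P_5 \leftrightarrow P_6, & P_6 \leftrightarrow P_1, \\
P_7 \leftrightarrow P_8, & P_8 \leftrightarrow P_9, & P_9 \leftrightarrow P_{10}, & P_{10} \leftrightarrow P_{11}, & P_{11} \leftrightarrow P_{12}, & P_{12} \leftrightarrow P_7, \\
P_1 \leftrightarrow P_7, & P_2 \leftrightarrow P_8, & P_3 \leftrightarrow P_9, & P_4 \leftrightarrow P_{10}, & P_5 \leftrightarrow P_{11}, & P_6 \leftrightarrow P_{12}
\end{array}
\]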



In View 1 these 18 straight line segments are shown as they would appear on the video screen. Note that only the x-
and y-coordinates of the vertices are needed by the video display system to draw the view, because only the projection of the object
onto the xy-plane is displayed. However, we must keep track of the z-coordinates to carry out certain transformations discussed
later.



View 1



We now show how to form new views of the object by scaling, translating, or rotating the initial view. We first construct a $3 \times n$
matrix P, referred to as the coordinate matrix of the view, whose columns are the coordinates of the n points of a view:

\[
P = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \\ y_1 & y_2 & \cdots & y_n \\ z_1 & z_2 & \cdots & z_n \end{bmatrix}
\]



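For example, the coordinate matrix P corresponding to View 1 is the $3 \times 12$ matrix

\[
\left[\begin{array}{rrrrrrrrrrrr}
1.000 & .500 & -.500 & -1.000 & -.500 & .500 & .840 & .315 & -.210 & -.360 & -.210 & .315 \\
-.800 & -.800 & -.800 & -.800 & -.800 & -.800 & -.400 & .125 & .650 & .800 & .650 & .125 \\
.000 & -.866 & -.866 & .000 & .866 & .866 & .000 & -.546 & -.364 & .000 & .364 & .546
\end{array}\right]
\]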



We will show below how to transform the coordinate matrix P of a view to a new coordinate matrix $P'$ corresponding to a new
view of the object. The straight line segments connecting the various points move with the points as they are transformed. In this
way, each view is uniquely determined by its coordinate matrix once we have specified which pairs of points in the original view
are to be connected by straight lines.
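As a rough illustration of how such a view might be stored, the following Python sketch holds the 12 vertices and 18 segments of View 1 and forms the coordinate matrix. The names `VERTICES`, `EDGES`, `coordinate_matrix`, and `project_to_screen` are hypothetical, chosen here for illustration; the base vertices are taken to lie at y = -.800.

```python
"""Coordinate matrix and xy-projection for the truncated pyramid of View 1."""

# Vertices P1..P12 as (x, y, z) triples, taken from the list in the text.
VERTICES = [
    (1.000, -0.800, 0.000), (0.500, -0.800, -0.866),
    (-0.500, -0.800, -0.866), (-1.000, -0.800, 0.000),
    (-0.500, -0.800, 0.866), (0.500, -0.800, 0.866),
    (0.840, -0.400, 0.000), (0.315, 0.125, -0.546),
    (-0.210, 0.650, -0.364), (-0.360, 0.800, 0.000),
    (-0.210, 0.650, 0.364), (0.315, 0.125, 0.546),
]

# The 18 segments as 1-based index pairs:
# lower hexagon, upper hexagon, then the six side edges.
EDGES = (
    [(i, i % 6 + 1) for i in range(1, 7)]
    + [(i, (i - 6) % 6 + 7) for i in range(7, 13)]
    + [(i, i + 6) for i in range(1, 7)]
)


def coordinate_matrix(points):
    """Return the 3 x n coordinate matrix whose columns are the given points."""
    return [list(row) for row in zip(*points)]


def project_to_screen(P, edges):
    """Drop the z-row and return each segment as a pair of (x, y) screen points."""
    xs, ys = P[0], P[1]
    return [((xs[i - 1], ys[i - 1]), (xs[j - 1], ys[j - 1])) for i, j in edges]


P = coordinate_matrix(VERTICES)
segments = project_to_screen(P, EDGES)
print(len(P), len(P[0]), len(segments))  # rows, columns, segments
```

Only the first two rows of P are used for drawing; the z-row is carried along because later transformations need it.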

Scaling 

The first type of transformation we consider consists of scaling a view along the x, y, and z directions by factors of $\alpha$, $\beta$, and $\gamma$,
respectively. By this we mean that if a point $P_i$ has coordinates $(x_i, y_i, z_i)$ in the original view, it is to move to a new point $P_i'$ with
coordinates $(\alpha x_i, \beta y_i, \gamma z_i)$ in the new view. This has the effect of transforming a unit cube in the original view to a rectangular
parallelepiped of dimensions $\alpha \times \beta \times \gamma$ (Figure 11.11.2). Mathematically, this may be accomplished with matrix multiplication as
follows. Define a $3 \times 3$ diagonal matrix

\[
S = \begin{bmatrix} \alpha & 0 & 0 \\ 0 & \beta & 0 \\ 0 & 0 & \gamma \end{bmatrix}
\]

Figure 11.11.2

Then, if a point $P_i$ in the original view is represented by the column vector

\[
\begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix}
\]

the transformed point $P_i'$ is represented by the column vector

\[
\begin{bmatrix} x_i' \\ y_i' \\ z_i' \end{bmatrix}
= \begin{bmatrix} \alpha & 0 & 0 \\ 0 & \beta & 0 \\ 0 & 0 & \gamma \end{bmatrix}
\begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix}
\]



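Using the coordinate matrix P, which contains the coordinates of all n points of the original view as its columns, we can transform
these n points simultaneously to produce the coordinate matrix $P'$ of the scaled view, as follows:

\[
SP = \begin{bmatrix} \alpha & 0 & 0 \\ 0 & \beta & 0 \\ 0 & 0 & \gamma \end{bmatrix}
\begin{bmatrix} x_1 & x_2 & \cdots & x_n \\ y_1 & y_2 & \cdots & y_n \\ z_1 & z_2 & \cdots & z_n \end{bmatrix}
= \begin{bmatrix} \alpha x_1 & \alpha x_2 & \cdots & \alpha x_n \\ \beta y_1 & \beta y_2 & \cdots & \beta y_n \\ \gamma z_1 & \gamma z_2 & \cdots & \gamma z_n \end{bmatrix}
= P'
\]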



The new coordinate matrix can then be entered into the video display system to produce the new view of the object. As an
example, View 2 is View 1 scaled by setting $\alpha = 1.8$, $\beta = 0.5$, and $\gamma = 3.0$. Note that the scaling $\gamma = 3.0$ along the z-axis is not
visible in View 2, since we see only the projection of the object onto the xy-plane.

View 2

View 1 scaled by $\alpha = 1.8$, $\beta = 0.5$, $\gamma = 3.0$.
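The scaling step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: matrices are stored as lists of rows, the helper names `matmul` and `scaling_matrix` are hypothetical, and only three columns of the View 1 coordinate matrix (points 1, 7, and 8) are used to keep the example short.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def scaling_matrix(alpha, beta, gamma):
    """Diagonal matrix S that scales x, y, z by alpha, beta, gamma."""
    return [[alpha, 0.0, 0.0], [0.0, beta, 0.0], [0.0, 0.0, gamma]]


# Three columns of the View 1 coordinate matrix (points 1, 7, and 8).
P = [
    [1.000, 0.840, 0.315],
    [-0.800, -0.400, 0.125],
    [0.000, 0.000, -0.546],
]

S = scaling_matrix(1.8, 0.5, 3.0)   # the factors used for View 2
P_scaled = matmul(S, P)             # P' = SP scales every column at once

for row in P_scaled:
    print([round(value, 4) for value in row])
```

Because S is diagonal, each row of P is simply multiplied by its own factor, which is why the z-scaling leaves the projected picture unchanged.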



Translation 

We next consider the transformation of translating or displacing an object to a new position on the screen. Referring to Figure
11.11.3, suppose we desire to change an existing view so that each point $P_i$ with coordinates $(x_i, y_i, z_i)$ moves to a new point $P_i'$
with coordinates $(x_i + x_0, y_i + y_0, z_i + z_0)$. The vector

\[
\begin{bmatrix} x_0 \\ y_0 \\ z_0 \end{bmatrix}
\]

Figure 11.11.3

is called the translation vector of the transformation. By defining a $3 \times n$ matrix T as

\[
T = \begin{bmatrix} x_0 & x_0 & \cdots & x_0 \\ y_0 & y_0 & \cdots & y_0 \\ z_0 & z_0 & \cdots & z_0 \end{bmatrix}
\]

we can translate all n points of the view determined by the coordinate matrix P by matrix addition via the equation

\[ P' = P + T \]

The coordinate matrix $P'$ then specifies the new coordinates of the n points. For example, if we wish to translate View 1 according
to the translation vector

\[
\begin{bmatrix} 1.2 \\ 0.4 \\ 1.7 \end{bmatrix}
\]

the result is View 3. Note, again, that the translation $z_0 = 1.7$ along the z-axis does not show up explicitly in View 3.

View 3

View 1 translated by $x_0 = 1.2$, $y_0 = 0.4$, $z_0 = 1.7$.



In Exercise 7, a technique of performing translations by matrix multiplication rather than by matrix addition is explained. 
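Translation by matrix addition can be sketched the same way. The helpers `translation_matrix` and `matadd` below are hypothetical names, and the three sample columns are points 1, 7, and 8 of View 1.

```python
def translation_matrix(t, n):
    """3 x n matrix T whose n columns all equal the translation vector t."""
    return [[component] * n for component in t]


def matadd(A, B):
    """Entrywise sum of two matrices stored as lists of rows."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


# Three columns of the View 1 coordinate matrix (points 1, 7, and 8).
P = [
    [1.000, 0.840, 0.315],
    [-0.800, -0.400, 0.125],
    [0.000, 0.000, -0.546],
]

T = translation_matrix((1.2, 0.4, 1.7), n=len(P[0]))  # vector used for View 3
P_translated = matadd(P, T)                           # P' = P + T

for row in P_translated:
    print([round(value, 4) for value in row])
```

Unlike scaling, this step is not a product of P with a $3 \times 3$ matrix, which is why the text treats it by addition here.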

Rotation 

A more complicated type of transformation is a rotation of a view about one of the three coordinate axes. We begin with a rotation
about the z-axis (the axis perpendicular to the screen) through an angle $\theta$. Given a point $P_i$ in the original view with coordinates
$(x_i, y_i, z_i)$, we wish to compute the new coordinates $(x_i', y_i', z_i')$ of the rotated point $P_i'$. Referring to Figure 11.11.4, in which
$x_i = \rho\cos\phi$ and $y_i = \rho\sin\phi$, and using a little trigonometry, the reader should be able to derive the following:

\[
\begin{aligned}
x_i' &= \rho\cos(\phi + \theta) = \rho\cos\phi\cos\theta - \rho\sin\phi\sin\theta = x_i\cos\theta - y_i\sin\theta \\
y_i' &= \rho\sin(\phi + \theta) = \rho\cos\phi\sin\theta + \rho\sin\phi\cos\theta = x_i\sin\theta + y_i\cos\theta \\
z_i' &= z_i
\end{aligned}
\]

Figure 11.11.4

These equations can be written in matrix form as

\[
\begin{bmatrix} x_i' \\ y_i' \\ z_i' \end{bmatrix}
= \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix}
\]

If we let R denote the $3 \times 3$ matrix in this equation, all n points can be rotated by the matrix product

\[ P' = RP \]

to yield the coordinate matrix $P'$ of the rotated view.
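A minimal Python sketch of the z-axis rotation follows. It also checks the trigonometric derivation numerically for one arbitrarily chosen point. The function names and the sample values of $\rho$, $\phi$, and $\theta$ are illustrative assumptions, not part of the text.

```python
import math


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def rotation_z(theta_degrees):
    """Matrix R for a rotation through theta about the z-axis."""
    t = math.radians(theta_degrees)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotate_polar(rho, phi_degrees, theta_degrees):
    """Rotated (x, y) computed directly from the polar form rho, phi + theta."""
    angle = math.radians(phi_degrees + theta_degrees)
    return rho * math.cos(angle), rho * math.sin(angle)


# Points 1 and 7 of View 1 as columns, rotated 90 degrees about the z-axis.
P = [[1.000, 0.840], [-0.800, -0.400], [0.000, 0.000]]
P_rotated = matmul(rotation_z(90), P)

# Compare the matrix form with the polar derivation for a sample point.
rho, phi, theta = 2.0, 40.0, 25.0
x = rho * math.cos(math.radians(phi))
y = rho * math.sin(math.radians(phi))
via_matrix = matmul(rotation_z(theta), [[x], [y], [0.0]])
via_polar = rotate_polar(rho, phi, theta)
print(via_matrix[0][0] - via_polar[0], via_matrix[1][0] - via_polar[1])
```

The two printed differences should be zero up to floating-point rounding, since the matrix form is just the angle-addition formulas written out.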

Rotations about the x- and y-axes can be accomplished analogously, and the resulting rotation matrices are given with Views
4, 5, and 6. These three new views of the truncated pyramid correspond to rotations of View 1 about the x-, y-, and
z-axes, respectively, each through an angle of 90°.

View 4

View 1 rotated 90° about the x-axis.

Rotation about the y-axis

View 5

View 1 rotated 90° about the y-axis.

Rotation about the z-axis

View 6

View 1 rotated 90° about the z-axis.
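For reference, a rotation through an angle $\theta$ about each coordinate axis can be written in the following standard form. The z-axis matrix is the matrix R derived above. The x- and y-axis forms are stated here under the usual right-handed convention, which is an assumption, although they agree with the matrices $R_1$ and $R_2$ used in the oblique-view example of this section. Setting $\theta = 90°$ gives the matrices appropriate to Views 4, 5, and 6.

```latex
\[
R_x(\theta) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix},
\qquad
R_y(\theta) = \begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix},
\qquad
R_z(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
```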



Rotations about three coordinate axes may be combined to give oblique views of an object. For example, View 7 is View 1 rotated
first about the x-axis through 30°, then about the y-axis through -70°, and finally about the z-axis through -27°. Mathematically,
these three successive rotations can be embodied in the single transformation equation $P' = RP$, where R is the product of three
individual rotation matrices:

\[
R_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos(30°) & -\sin(30°) \\ 0 & \sin(30°) & \cos(30°) \end{bmatrix}
\]

\[
R_2 = \begin{bmatrix} \cos(-70°) & 0 & \sin(-70°) \\ 0 & 1 & 0 \\ -\sin(-70°) & 0 & \cos(-70°) \end{bmatrix}
\]

\[
R_3 = \begin{bmatrix} \cos(-27°) & -\sin(-27°) & 0 \\ \sin(-27°) & \cos(-27°) & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

in the order

\[
R = R_3 R_2 R_1 = \begin{bmatrix} .305 & -.025 & -.952 \\ -.155 & .985 & -.076 \\ .940 & .171 & .296 \end{bmatrix}
\]
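The product $R = R_3 R_2 R_1$ can be checked numerically. The sketch below uses hypothetical helper names and assumes the three matrix forms shown for $R_1$, $R_2$, and $R_3$. Under those definitions the entry in the second row, first column evaluates to about -.155, that is, with a negative sign, as the orthogonality of the columns of R also requires.

```python
import math


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def rotation_x(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rotation_y(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rotation_z(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


R1 = rotation_x(30)    # applied first
R2 = rotation_y(-70)   # applied second
R3 = rotation_z(-27)   # applied last, so it is the leftmost factor

R = matmul(R3, matmul(R2, R1))
for row in R:
    print([round(value, 3) for value in row])

# An orthogonal matrix satisfies R^T R = I; check the largest deviation.
Rt = [list(col) for col in zip(*R)]
RtR = matmul(Rt, R)
deviation = max(abs(RtR[i][j] - (1.0 if i == j else 0.0))
                for i in range(3) for j in range(3))
print(deviation)
```

The order matters: the rotation applied first sits closest to P, so the product is written $R_3 R_2 R_1$ rather than $R_1 R_2 R_3$.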





View 7

Oblique view of truncated pyramid.



As a final illustration, in View 8 we have two separate views of the truncated pyramid, which constitute a stereoscopic pair. They
were produced by first rotating View 7 about the y-axis through an angle of -3° and translating it to the right, then rotating the
same View 7 about the y-axis through an angle of +3° and translating it to the left. The translation distances were chosen so that
the stereoscopic views are about $2\tfrac{1}{2}$ inches apart, the approximate distance between a pair of eyes.





View 8 



Stereoscopic figure of truncated pyramid. The three-dimensionality of the diagram can be seen by holding the book 
about one foot away and focusing on a distant object. Then by shifting gaze to View 8 without refocusing, you can make 
the two views of the stereoscopic pair merge together and produce the desired effect. 



Exercise Set 11.11



1. View 9 is a view of a square with vertices (0, 0, 0), (1, 0, 0), (1, 1, 0), and (0, 1, 0).

(a) What is the coordinate matrix of View 9?

(b) What is the coordinate matrix of View 9 after it is scaled by a factor of $1\tfrac{1}{2}$ in the x-direction and $\tfrac{1}{2}$ in the y-direction? Draw
a sketch of the scaled view.

View 9

Square with vertices (0, 0, 0), (1, 0, 0), (1, 1, 0), and (0, 1, 0) (Exercises 1 and 2).



(c) What is the coordinate matrix of View 9 after it is translated by the following vector?

\[
\begin{bmatrix} -2 \\ -1 \\ 3 \end{bmatrix}
\]

Draw a sketch of the translated view.



(d) What is the coordinate matrix of View 9 after it is rotated through an angle of -30° about the z-axis? Draw a sketch of 
the rotated view. 



2. (a) If the coordinate matrix of View 9 is multiplied by the matrix

\[
\begin{bmatrix} 1 & \tfrac{1}{2} & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

the result is the coordinate matrix of View 10. Such a transformation is called a shear in the
x-direction with factor $\tfrac{1}{2}$ with respect to the y-coordinate. Show that under such a
transformation, a point with coordinates $(x_i, y_i, z_i)$ has new coordinates $(x_i + \tfrac{1}{2}y_i, y_i, z_i)$.

View 10

View 9 sheared along the x-axis by $\tfrac{1}{2}$ with respect to the y-coordinate (Exercise 2).



(b) What are the coordinates of the four vertices of the sheared square in View 10?



(c) The matrix

\[
\begin{bmatrix} 1 & 0 & 0 \\ .6 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

determines a shear in the y-direction with factor .6 with respect to the x-coordinate (an
example appears in View 11). Sketch a view of the square in View 9 after such a shearing
transformation, and find the new coordinates of its four vertices.

View 11

View 1 sheared along the y-axis by .6 with respect to the x-coordinate (Exercise 2).



3. (a) The reflection about the xz-plane is defined as the transformation that takes a point $(x_i, y_i, z_i)$ to the point $(x_i, -y_i, z_i)$
(e.g., View 12). If P and $P'$ are the coordinate matrices of a view and its reflection about the xz-plane, respectively, find
a matrix M such that $P' = MP$.

View 12

View 1 reflected about the xz-plane (Exercise 3).



(b) Analogous to part (a), define the reflection about the yz-plane and construct the corresponding transformation matrix. 
Draw a sketch of View 1 reflected about the yz-plane. 

(c) Analogous to part (a), define the reflection about the xy-plane and construct the corresponding transformation matrix. 
Draw a sketch of View 1 reflected about the xy-plane. 



4. (a) View 13 is View 1 subject to the following five transformations:

1. Scale by a factor of $\tfrac{1}{2}$ in the x-direction, 2 in the y-direction, and $\tfrac{1}{3}$ in the z-direction.

2. Translate $\tfrac{1}{2}$ unit in the x-direction.

3. Rotate 20° about the x-axis.

4. Rotate -45° about the y-axis.

5. Rotate 90° about the z-axis.

Construct the five matrices $M_1$, $M_2$, $M_3$, $M_4$, and $M_5$ associated with these five transformations.

View 13

View 1 scaled, translated, and rotated (Exercise 4).

(b) If P is the coordinate matrix of View 1 and $P'$ is the coordinate matrix of View 13, express $P'$ in terms of $M_1$, $M_2$, $M_3$,
$M_4$, $M_5$, and P.



5. (a) View 14 is View 1 subject to the following seven transformations:

1. Scale by a factor of .3 in the x-direction and by a factor of .5 in the y-direction.

2. Rotate 45° about the x-axis.

3. Translate 1 unit in the x-direction.

4. Rotate 35° about the y-axis.

5. Rotate -45° about the z-axis.

6. Translate 1 unit in the z-direction.

7. Scale by a factor of 2 in the x-direction.

Construct the matrices $M_1, M_2, \ldots, M_7$ associated with these seven transformations.

View 14

View 1 scaled, translated, and rotated (Exercise 5).

(b) If P is the coordinate matrix of View 1 and $P'$ is the coordinate matrix of View 14, express $P'$ in terms of $M_1, M_2, \ldots,$
$M_7$, and P.



6. Suppose that a view with coordinate matrix P is to be rotated through an angle $\theta$ about an axis through the origin and specified
by two angles $\alpha$ and $\beta$ (see Figure Ex-6). If $P'$ is the coordinate matrix of the rotated view, find rotation matrices $R_1$, $R_2$, $R_3$, $R_4$,
and $R_5$ such that

\[ P' = R_5 R_4 R_3 R_2 R_1 P \]

Figure Ex-6

Hint The desired rotation can be accomplished in the following five steps:

1. Rotate through an angle of $\beta$ about the y-axis.

2. Rotate through an angle of $\alpha$ about the z-axis.

3. Rotate through an angle of $\theta$ about the y-axis.

4. Rotate through an angle of $-\alpha$ about the z-axis.

5. Rotate through an angle of $-\beta$ about the y-axis.



7. This exercise illustrates a technique for translating a point with coordinates $(x_i, y_i, z_i)$ to a point with coordinates
$(x_i + x_0, y_i + y_0, z_i + z_0)$ by matrix multiplication rather than matrix addition.

(a) Let the point $(x_i, y_i, z_i)$ be associated with the column vector

\[
\mathbf{v}_i = \begin{bmatrix} x_i \\ y_i \\ z_i \\ 1 \end{bmatrix}
\]

and let the point $(x_i + x_0, y_i + y_0, z_i + z_0)$ be associated with the column vector

\[
\mathbf{v}_i' = \begin{bmatrix} x_i + x_0 \\ y_i + y_0 \\ z_i + z_0 \\ 1 \end{bmatrix}
\]

Find a $4 \times 4$ matrix M such that $\mathbf{v}_i' = M\mathbf{v}_i$.

(b) Find the specific $4 \times 4$ matrix of the above form that will effect the translation of the point (4, -2, 3) to the point (-1, 7,
0).



8. For the three rotation matrices given with Views 4, 5, and 6, show that

\[ R^{-1} = R^T \]

(A matrix with this property is called an orthogonal matrix. See Section 6.5.)



Section 11.11

Technology Exercises



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 



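T1. Let (a, b, c) be a unit vector normal to the plane ax + by + cz = 0, and let $\mathbf{r} = (x, y, z)$ be a vector. It can be shown that the
mirror image of the vector $\mathbf{r}$ through the above plane has coordinates $\mathbf{r}_m = (x_m, y_m, z_m)$, where

\[
\begin{bmatrix} x_m \\ y_m \\ z_m \end{bmatrix} = M \begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]

with

\[
M = I - 2\mathbf{n}\mathbf{n}^T
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
- 2 \begin{bmatrix} a \\ b \\ c \end{bmatrix} \begin{bmatrix} a & b & c \end{bmatrix}
\]

(a) Show that $M^2 = I$ and give a physical reason why this must be so.
Hint Use the fact that (a, b, c) is a unit vector to show that $\mathbf{n}^T\mathbf{n} = 1$.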

(b) Use a computer to show that $\det(M) = -1$.

(c) The eigenvectors of M satisfy the equation

\[
\begin{bmatrix} x_m \\ y_m \\ z_m \end{bmatrix}
= M \begin{bmatrix} x \\ y \\ z \end{bmatrix}
= \lambda \begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]

and therefore correspond to those vectors whose direction is not affected by a reflection
through the plane. Use a computer to determine the eigenvectors and eigenvalues of M, and
then give a physical argument to support your answer.



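T2. A vector $\mathbf{v} = (x, y, z)$ is rotated by an angle $\theta$ about an axis having unit vector (a, b, c), thereby forming the rotated vector
$\mathbf{v}_R = (x_R, y_R, z_R)$. It can be shown that

\[
\begin{bmatrix} x_R \\ y_R \\ z_R \end{bmatrix} = R(\theta) \begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]

with

\[
R(\theta) = \cos(\theta) \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
+ (1 - \cos(\theta)) \begin{bmatrix} a \\ b \\ c \end{bmatrix} \begin{bmatrix} a & b & c \end{bmatrix}
+ \sin(\theta) \begin{bmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{bmatrix}
\]

(a) Use a computer to show that $R(\theta)R(\varphi) = R(\theta + \varphi)$, and then give a physical reason why this must be so. Depending
on the sophistication of the computer you are using, you may have to experiment using different values of a, b, and

\[ c = \sqrt{1 - a^2 - b^2} \]

(b) Show also that $R^{-1}(\theta) = R(-\theta)$ and give a physical reason why this must be so.

(c) Use a computer to show that $\det(R(\theta)) = +1$.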



Copyright © 2005 John Wiley & Sons, Inc. All rights reserved. 



11.12

EQUILIBRIUM TEMPERATURE DISTRIBUTIONS



In this section we shall see that the equilibrium temperature distribution within a
trapezoidal plate can be found when the temperatures around the edges of the
plate are specified. The problem is reduced to solving a system of linear
equations. Also, an iterative technique for solving the problem and a "random
walk" approach to the problem are described.



Prerequisites: Linear Systems 
Matrices 



Intuitive Understanding of Limits 



Boundary Data 

Suppose that the two faces of the thin trapezoidal plate shown in Figure 11.12.1a are insulated from heat. Suppose that we are also
given the temperature along the four edges of the plate. For example, let the temperature be constant on each edge with values of 
0°, 0°, 1°, and 2°, as in the figure. After a period of time, the temperature inside the plate will stabilize. Our objective in this 
section is to determine this equilibrium temperature distribution at the points inside the plate. As we will see, the interior 
equilibrium temperature is completely determined by the boundary data — that is, the temperature along the edges of the plate. 





Figure 11.12.1



The equilibrium temperature distribution can be visualized by the use of curves that connect points of equal temperature. Such 
curves are called isotherms of the temperature distribution. In Figure 11.12.1b we have sketched a few isotherms, using
information we derive later in the chapter.

Although all our calculations will be for the trapezoidal plate illustrated, our techniques generalize easily to a plate of any practical 
shape. They also generalize to the problem of finding the temperature within a three-dimensional body. In fact, our "plate" could 
be the cross section of some solid object if the flow of heat perpendicular to the cross section is negligible. For example, Figure 
11.12.1 could represent the cross section of a long dam. The dam is exposed to three different temperatures: the temperature of the
ground at its base, the temperature of the water on one side, and the temperature of the air on the other side. A knowledge of the 
temperature distribution inside the dam is necessary to determine the thermal stresses to which it is subjected. 

Next, we shall consider a certain thermodynamic principle that characterizes the temperature distribution we are seeking. 

The Mean-Value Property 



There are many different ways to obtain a mathematical model for our problem. The approach we use is based on the following 
property of equilibrium temperature distributions. 



THEOREM 11.12.1 



The Mean-Value Property 

Let a plate be in thermal equilibrium and let P be a point inside the plate. Then if C is any circle with center at P that is
completely contained in the plate, the temperature at P is the average value of the temperature on the circle (Figure 11.12.2).




Figure 11.12.2 



This property is a consequence of certain basic laws of molecular motion, and we will not attempt to derive it. Basically, this 
property states that in equilibrium, thermal energy tends to distribute itself as evenly as possible consistent with the boundary 
conditions. It can be shown that the mean-value property uniquely determines the equilibrium temperature distribution of a plate.

Unfortunately, determining the equilibrium temperature distribution from the mean-value property is not an easy matter. However,
if we restrict ourselves to finding the temperature only at a finite set of points within the plate, the problem can be reduced to 
solving a linear system. We pursue this idea next. 

Discrete Formulation of the Problem 



We can overlay our trapezoidal plate with a succession of finer and finer square nets or meshes (Figure 11.12.3). In (a) we have a
rather coarse net; in (b) we have a net with half the spacing of (a); and in (c) we have a net with the spacing again reduced by
half. The points of intersection of the net lines are called mesh points. We classify them as boundary mesh points if they fall on the 
boundary of the plate or as interior mesh points if they lie in the interior of the plate. For the three net spacings we have chosen, 
there are 1, 9, and 49 interior mesh points, respectively. 



Figure 11.12.3 (a) 1 interior mesh point; (b) 9 interior mesh points; (c) 49 interior mesh points



In the discrete formulation of our problem, we try to find the temperature only at the interior mesh points of some particular net. 
For a rather fine net, as in (c), this will provide an excellent picture of the temperature distribution throughout the entire plate. 

At the boundary mesh points, the temperature is given by the boundary data. (In Figure 11.12.3 we have labeled all the boundary
mesh points with their corresponding temperatures.) At the interior mesh points, we shall apply the following discrete version of
the mean-value property.

THEOREM 11.12.2 



Discrete Mean-Value Property 

At each interior mesh point, the temperature is approximately the average of the temperatures at the four neighboring mesh 
points. 



This discrete version is a reasonable approximation to the true mean-value property. But because it is only an approximation, it
will provide only an approximation to the true temperatures at the interior mesh points. However, the approximations will get
better as the mesh spacing decreases. In fact, as the mesh spacing approaches zero, the approximations approach the exact
temperature distribution, a fact proved in advanced courses in numerical analysis. We will illustrate this convergence by computing
the approximate temperatures at the mesh points for the three mesh spacings given in Figure 11.12.3.

Case (a) of Figure 11.12.3 is simple, for there is only one interior mesh point. If we let $t_0$ be the temperature at this mesh point, the
discrete mean-value property immediately gives

\[ t_0 = \tfrac{1}{4}(2 + 1 + 0 + 0) = 0.75 \]

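In case (b) we can label the temperatures at the nine interior mesh points $t_1, t_2, \ldots, t_9$ as in Figure 11.12.3b. (The particular
ordering is not important.) By applying the discrete mean-value property successively to each of these nine mesh points, we obtain
the following nine equations:

\[
\begin{aligned}
t_1 &= \tfrac{1}{4}(t_2 + 2 + 0 + 0) \\
t_2 &= \tfrac{1}{4}(t_1 + t_3 + t_4 + 2) \\
t_3 &= \tfrac{1}{4}(t_2 + t_5 + 0 + 0) \\
t_4 &= \tfrac{1}{4}(t_2 + t_5 + t_7 + 2) \\
t_5 &= \tfrac{1}{4}(t_3 + t_4 + t_6 + t_8) \\
t_6 &= \tfrac{1}{4}(t_5 + t_9 + 0 + 0) \\
t_7 &= \tfrac{1}{4}(t_4 + t_8 + 1 + 2) \\
t_8 &= \tfrac{1}{4}(t_5 + t_7 + t_9 + 1) \\
t_9 &= \tfrac{1}{4}(t_6 + t_8 + 1 + 0)
\end{aligned}
\tag{1}
\]

This is a system of nine linear equations in nine unknowns. We can rewrite it in matrix form as

\[ \mathbf{t} = M\mathbf{t} + \mathbf{b} \tag{2} \]

where, reading the coefficients off Equations (1),

\[
\mathbf{t} = \begin{bmatrix} t_1 \\ t_2 \\ t_3 \\ t_4 \\ t_5 \\ t_6 \\ t_7 \\ t_8 \\ t_9 \end{bmatrix},
\qquad
M = \frac{1}{4}\begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0
\end{bmatrix},
\qquad
\mathbf{b} = \begin{bmatrix} \tfrac{1}{2} \\ \tfrac{1}{2} \\ 0 \\ \tfrac{1}{2} \\ 0 \\ 0 \\ \tfrac{3}{4} \\ \tfrac{1}{4} \\ \tfrac{1}{4} \end{bmatrix}
\]

To solve Equation (2), we write it as

\[ (I - M)\mathbf{t} = \mathbf{b} \]

The solution for $\mathbf{t}$ is thus

\[ \mathbf{t} = (I - M)^{-1}\mathbf{b} \tag{3} \]

as long as the matrix $(I - M)$ is invertible. This is indeed the case, and the solution for $\mathbf{t}$ as calculated by (3) is

\[
\mathbf{t} = \begin{bmatrix} 0.7846 \\ 1.1383 \\ 0.4719 \\ 1.2967 \\ 0.7491 \\ 0.3265 \\ 1.2995 \\ 0.9014 \\ 0.5570 \end{bmatrix}
\tag{4}
\]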
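The nine-point system can be solved numerically as a check on Equation (4). The sketch below is one possible approach, not the method used in the text: it reads the interior neighbors and boundary temperatures off Equations (1), assembles $I - M$ and $\mathbf{b}$, and applies Gaussian elimination with partial pivoting. The names `INTERIOR`, `BOUNDARY`, and `solve` are hypothetical.

```python
# Neighbors of each interior mesh point, read off Equations (1).
INTERIOR = {1: [2], 2: [1, 3, 4], 3: [2, 5], 4: [2, 5, 7], 5: [3, 4, 6, 8],
            6: [5, 9], 7: [4, 8], 8: [5, 7, 9], 9: [6, 8]}
# Temperatures of the boundary mesh points adjacent to each interior point.
BOUNDARY = {1: [2, 0, 0], 2: [2], 3: [0, 0], 4: [2], 5: [],
            6: [0, 0], 7: [1, 2], 8: [1], 9: [1, 0]}


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    aug = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


n = len(INTERIOR)
# M has entry 1/4 wherever point j is a neighbor of point i.
M = [[0.25 if j in INTERIOR[i] else 0.0 for j in range(1, n + 1)]
     for i in range(1, n + 1)]
# b collects one quarter of the adjacent boundary temperatures.
b = [sum(BOUNDARY[i]) / 4 for i in range(1, n + 1)]
I_minus_M = [[(1.0 if i == j else 0.0) - M[i][j] for j in range(n)]
             for i in range(n)]

t = solve(I_minus_M, b)
print([round(value, 4) for value in t])
```

Solving $(I - M)\mathbf{t} = \mathbf{b}$ directly avoids forming the inverse in Equation (3), which is the usual numerical practice for a single right-hand side.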



Figure 11.12.4 is a diagram of the plate with the nine interior mesh points labeled with their temperatures as given by this solution.

Figure 11.12.4



For case (c) of Figure 11.12.3, we repeat this same procedure. We label the temperatures at the 49 interior mesh points as $t_1, t_2, \ldots, t_{49}$
in some manner. For example, we may begin at the top of the plate and proceed from left to right along each row of mesh points.
Applying the discrete mean-value property to each mesh point gives a system of 49 linear equations in 49 unknowns:

\[
\begin{aligned}
t_1 &= \tfrac{1}{4}(t_2 + 2 + 0 + 0) \\
t_2 &= \tfrac{1}{4}(t_1 + t_3 + t_4 + 2) \\
&\ \ \vdots \\
t_{48} &= \tfrac{1}{4}(t_{41} + t_{47} + t_{49} + 1) \\
t_{49} &= \tfrac{1}{4}(t_{42} + t_{48} + 0 + 1)
\end{aligned}
\tag{5}
\]

In matrix form, Equations (5) are

\[ \mathbf{t} = M\mathbf{t} + \mathbf{b} \]

where $\mathbf{t}$ and $\mathbf{b}$ are column vectors with 49 entries, and M is a $49 \times 49$ matrix. As in (3), the solution for $\mathbf{t}$ is

\[ \mathbf{t} = (I - M)^{-1}\mathbf{b} \tag{6} \]
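Writing out 49 equations by hand is tedious, so the assembly is a natural candidate for a short program. The sketch below rests on an assumption that the text does not state explicitly: the plate is taken to be the trapezoid with vertices (0, 0), (4, 0), (4, 2), and (0, 6), in units where the coarsest mesh spacing is 2, with temperature 2 on the left edge, 1 on the bottom edge, and 0 on the right and slanted edges. This geometry is inferred here from Equations (1) and (5) and from the interior point counts 1, 9, and 49. The parameter `k` (mesh spacing 2/k) and all function names are hypothetical.

```python
def mesh_points(k):
    """Interior mesh points (i, j) for spacing 2/k, top row first, left to right."""
    return [(i, j)
            for j in range(3 * k - 1, 0, -1)
            for i in range(1, 2 * k)
            if i + j < 3 * k]


def boundary_temperature(i, j):
    """Temperature at a boundary mesh point of the assumed trapezoid."""
    if i == 0:
        return 2.0      # left edge
    if j == 0:
        return 1.0      # bottom edge
    return 0.0          # right edge and slanted edge


def build_system(k):
    """Return (A, b) with A = I - M for the discrete mean-value equations."""
    points = mesh_points(k)
    index = {p: r for r, p in enumerate(points)}
    n = len(points)
    A = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    b = [0.0] * n
    for (i, j), r in index.items():
        for q in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if q in index:
                A[r][index[q]] -= 0.25
            else:
                b[r] += boundary_temperature(*q) / 4
    return A, b


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    aug = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


for k in (1, 2, 4):                     # cases (a), (b), (c)
    A, b = build_system(k)
    t = solve(A, b)
    print(k, len(t), round(t[len(t) // 2], 4))
```

With k = 2 the point ordering reproduces the labeling $t_1, \ldots, t_9$ of Equations (1), so the output for that case can be compared directly with Equation (4).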



In Figure 11.12.5 we display the temperatures at the 49 mesh points found by Equation (6). The nine unshaded temperatures in this
figure fall on the mesh points of Figure 11.12.4. In Table 1 we compare the temperatures at these nine common mesh points for the
three different mesh spacings used.




Figure 11.12.5 

Table 1

Temperatures at Common Mesh Points

          Case (a)    Case (b)    Case (c)
t_1          -         0.7846      0.8048
t_2          -         1.1383      1.1533
t_3          -         0.4719      0.4778
t_4          -         1.2967      1.3078
t_5        0.7500      0.7491      0.7513
t_6          -         0.3265      0.3157
t_7          -         1.2995      1.3042
t_8          -         0.9014      0.9032
t_9          -         0.5570      0.5554

Knowing that the temperatures of the discrete problem approach the exact temperatures as the mesh spacing decreases, we may
surmise that the nine temperatures obtained in case (c) are closer to the exact values than those in case (b).

A Numerical Technique

To obtain the 49 temperatures in case (c) of Figure 11.12.3, it was necessary to solve a linear system with 49 unknowns. A finer
net might involve a linear system with hundreds or even thousands of unknowns. Exact algorithms for the solutions of such large
systems are impractical, and for this reason we now discuss a numerical technique for the practical solution of these systems.



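To describe this technique, we look again at Equation (2):

\[ \mathbf{t} = M\mathbf{t} + \mathbf{b} \tag{7} \]

The vector $\mathbf{t}$ we are seeking appears on both sides of this equation. We consider a way of generating better and better
approximations to the vector solution $\mathbf{t}$. For the initial approximation $\mathbf{t}^{(0)}$ we may take $\mathbf{t}^{(0)} = \mathbf{0}$ if no better choice is available. If we
substitute $\mathbf{t}^{(0)}$ into the right side of (7) and label the resulting left side as $\mathbf{t}^{(1)}$, we have

\[ \mathbf{t}^{(1)} = M\mathbf{t}^{(0)} + \mathbf{b} \tag{8} \]

Usually $\mathbf{t}^{(1)}$ is a better approximation to the solution than is $\mathbf{t}^{(0)}$. If we substitute $\mathbf{t}^{(1)}$ into the right side of (7), we generate another
approximation, which we label $\mathbf{t}^{(2)}$:

\[ \mathbf{t}^{(2)} = M\mathbf{t}^{(1)} + \mathbf{b} \tag{9} \]

Continuing in this way, we generate a sequence of approximations as follows:

\[
\begin{aligned}
\mathbf{t}^{(1)} &= M\mathbf{t}^{(0)} + \mathbf{b} \\
\mathbf{t}^{(2)} &= M\mathbf{t}^{(1)} + \mathbf{b} \\
\mathbf{t}^{(3)} &= M\mathbf{t}^{(2)} + \mathbf{b} \\
&\ \ \vdots \\
\mathbf{t}^{(n)} &= M\mathbf{t}^{(n-1)} + \mathbf{b}
\end{aligned}
\tag{10}
\]

One would hope that this sequence of approximations $\mathbf{t}^{(0)}, \mathbf{t}^{(1)}, \mathbf{t}^{(2)}, \ldots$ converges to the exact solution of (7). We do not have the
space here to go into the theoretical considerations necessary to show this. Suffice it to say that for the particular problem we are
considering, the sequence converges to the exact solution for any mesh size and for any initial approximation $\mathbf{t}^{(0)}$.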



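This technique of generating successive approximations to the solution of (7) is a variation of a technique called Jacobi iteration;
the approximations themselves are called iterates. As a numerical example, let us apply Jacobi iteration to the calculation of the
nine mesh point temperatures of case (b). Setting $\mathbf{t}^{(0)} = \mathbf{0}$, we have, from Equation (2),

\[
\mathbf{t}^{(1)} = M\mathbf{t}^{(0)} + \mathbf{b} = \mathbf{b}
= \begin{bmatrix} 0.5000 \\ 0.5000 \\ 0.0000 \\ 0.5000 \\ 0.0000 \\ 0.0000 \\ 0.7500 \\ 0.2500 \\ 0.2500 \end{bmatrix},
\qquad
\mathbf{t}^{(2)} = M\mathbf{t}^{(1)} + \mathbf{b}
= \begin{bmatrix} 0.6250 \\ 0.7500 \\ 0.1250 \\ 0.8125 \\ 0.1875 \\ 0.0625 \\ 0.9375 \\ 0.5000 \\ 0.3125 \end{bmatrix}
\]

Some additional iterates are

\[
\mathbf{t}^{(3)} = \begin{bmatrix} 0.6875 \\ 0.8906 \\ 0.2344 \\ 0.9688 \\ 0.3750 \\ 0.1250 \\ 1.0781 \\ 0.6094 \\ 0.3906 \end{bmatrix},
\quad
\mathbf{t}^{(10)} = \begin{bmatrix} 0.7791 \\ 1.1230 \\ 0.4573 \\ 1.2770 \\ 0.7236 \\ 0.3131 \\ 1.2848 \\ 0.8827 \\ 0.5446 \end{bmatrix},
\quad
\mathbf{t}^{(20)} = \begin{bmatrix} 0.7845 \\ 1.1380 \\ 0.4716 \\ 1.2963 \\ 0.7486 \\ 0.3263 \\ 1.2992 \\ 0.9010 \\ 0.5567 \end{bmatrix},
\quad
\mathbf{t}^{(30)} = \begin{bmatrix} 0.7846 \\ 1.1383 \\ 0.4719 \\ 1.2967 \\ 0.7491 \\ 0.3265 \\ 1.2995 \\ 0.9014 \\ 0.5570 \end{bmatrix}
\]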



All iterates beginning with the thirtieth are equal to $\mathbf{t}^{(30)}$ to four decimal places. Consequently, $\mathbf{t}^{(30)}$ is the exact solution to four
decimal places. This agrees with our previous result given in Equation (4).

The Jacobi iteration scheme applied to the linear system (5) with 49 unknowns produces iterates that begin repeating to four decimal
places after 119 iterations. Thus, $\mathbf{t}^{(119)}$ would provide the 49 temperatures of case (c) correct to four decimal places.
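The iteration (10) is short to program. The following sketch applies it to the nine-point mesh of case (b), storing each iterate as a dictionary keyed by mesh point number rather than forming the matrix M explicitly. That storage choice, and the names `INTERIOR`, `BOUNDARY`, `jacobi_step`, and `jacobi`, are assumptions made for this illustration; each step computes exactly $M\mathbf{t} + \mathbf{b}$.

```python
# Neighbors of each interior mesh point, read off Equations (1).
INTERIOR = {1: [2], 2: [1, 3, 4], 3: [2, 5], 4: [2, 5, 7], 5: [3, 4, 6, 8],
            6: [5, 9], 7: [4, 8], 8: [5, 7, 9], 9: [6, 8]}
# Temperatures of the boundary mesh points adjacent to each interior point.
BOUNDARY = {1: [2, 0, 0], 2: [2], 3: [0, 0], 4: [2], 5: [],
            6: [0, 0], 7: [1, 2], 8: [1], 9: [1, 0]}


def jacobi_step(t):
    """One iteration t -> M t + b: average the four neighbors of every point."""
    return {i: (sum(t[j] for j in INTERIOR[i]) + sum(BOUNDARY[i])) / 4
            for i in INTERIOR}


def jacobi(iterations):
    """Return the iterate t^(iterations), starting from t^(0) = 0."""
    t = {i: 0.0 for i in INTERIOR}
    for _ in range(iterations):
        t = jacobi_step(t)   # all points are updated from the previous iterate
    return [t[i] for i in sorted(t)]


for n in (1, 2, 3, 10, 20, 30):
    print(n, [round(value, 4) for value in jacobi(n)])
```

Every entry of the new iterate is computed from the old iterate only, which is what distinguishes Jacobi iteration from schemes that reuse freshly updated values within the same sweep.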



A Monte Carlo Technique 

In this section we describe a so-called Monte Carlo technique for computing the temperature at a single interior mesh point of the 
discrete problem without having to compute the temperatures at the remaining interior mesh points. First we define a discrete 
random walk along the net. By this we mean a directed path along the net lines (Figure 11.12.6) that joins a succession of mesh
points such that the direction of departure from each mesh point is chosen at random. Each of the four possible directions of 
departure from each mesh point along the path is to be equally probable. 











Figure 11.12.6



By the use of random walks, we can compute the temperature at a specified interior mesh point on the basis of the following 
property. 

THEOREM 11.12.3 



Random Walk Property 



Let $W_1, W_2, \ldots, W_n$ be a succession of random walks, all of which begin at a specified interior mesh point. Let $t_1^*, t_2^*, \ldots, t_n^*$ be
the temperatures at the boundary mesh points first encountered along each of these random walks. Then the average value
$(t_1^* + t_2^* + \cdots + t_n^*)/n$ of these boundary temperatures approaches the temperature at the specified interior mesh point as the
number of random walks n increases without bound.



This property is a consequence of the discrete mean-value property that the mesh point temperatures satisfy. The proof of the
random walk property involves elementary concepts from probability theory, and we will not give it here.
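The random walk property is easy to try numerically. The sketch below simulates walks on the nine-point mesh of case (b), with the neighbor structure read off Equations (1). The fixed seed, the number of walks, and the function names are arbitrary choices made for this illustration, and the estimate will vary with them.

```python
import random

# Neighbors of each interior mesh point, read off Equations (1).
INTERIOR = {1: [2], 2: [1, 3, 4], 3: [2, 5], 4: [2, 5, 7], 5: [3, 4, 6, 8],
            6: [5, 9], 7: [4, 8], 8: [5, 7, 9], 9: [6, 8]}
# Temperatures of the boundary mesh points adjacent to each interior point.
BOUNDARY = {1: [2, 0, 0], 2: [2], 3: [0, 0], 4: [2], 5: [],
            6: [0, 0], 7: [1, 2], 8: [1], 9: [1, 0]}


def random_walk(start, rng):
    """Walk until a boundary mesh point is reached; return its temperature."""
    point = start
    while True:
        choice = rng.randrange(4)            # four equally likely directions
        inside = INTERIOR[point]
        if choice < len(inside):
            point = inside[choice]           # step to an interior neighbor
        else:
            return BOUNDARY[point][choice - len(inside)]


def estimate_temperature(start, walks, seed=0):
    """Average the first-hit boundary temperatures over the given number of walks."""
    rng = random.Random(seed)
    total = sum(random_walk(start, rng) for _ in range(walks))
    return total / walks


estimate = estimate_temperature(start=5, walks=20000)
print(round(estimate, 4))   # to be compared with the value 0.7491 from Equation (4)
```

The statistical error of such an average shrinks only like the reciprocal of the square root of the number of walks, which is consistent with the slow convergence noted in the text.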

In Table 2 we display the results of a large number of computer-generated random walks for the evaluation of the temperature $t_5$ of
the nine-point mesh of case (b) in Figure 11.12.6. The first column lists the number n of the random walk. The second column lists
the temperature $t_n^*$ of the boundary point first encountered along the corresponding random walk. The last column contains the
cumulative average of the boundary temperatures encountered along the n random walks. Thus, after 1000 random walks we have
the approximation $t_5 \approx .7550$. This compares with the exact value $t_5 = .7491$ that we had previously evaluated. As can be seen, the
convergence to the exact value is not too rapid.

Exercise Set 11.12



1. A plate in the form of a circular disk has boundary temperatures of 0° on the left half of its circumference and 1° on the right half of
its circumference. A net with four interior mesh points is overlaid on the disk (see Figure Ex-1).

(a) Using the discrete mean-value property, write the $4 \times 4$ linear system $\mathbf{t} = M\mathbf{t} + \mathbf{b}$ that determines the approximate
temperatures at the four interior mesh points.

(b) Solve the linear system in part (a).

(c) Use the Jacobi iteration scheme with $\mathbf{t}^{(0)} = \mathbf{0}$ to generate the iterates $\mathbf{t}^{(1)}, \mathbf{t}^{(2)}, \mathbf{t}^{(3)}, \mathbf{t}^{(4)}$, and $\mathbf{t}^{(5)}$ for the linear system in
part (a). What is the "error vector" $\mathbf{t}^{(5)} - \mathbf{t}$, where $\mathbf{t}$ is the solution found in part (b)?

(d) By certain advanced methods, it can be determined that the exact temperatures to four decimal places at the four mesh
points are $t_1 = t_3 = .2871$ and $t_2 = t_4 = .7129$. What are the percentage errors in the values found in part (b)?




Figure Ex-1 



2. Use Theorem 11.12.1 to find the exact equilibrium temperature at the center of the disk in Exercise 1.

3. Calculate the first two iterates $\mathbf{t}^{(1)}$ and $\mathbf{t}^{(2)}$ for case (b) of Figure 11.12.3 with nine interior mesh points [Equation (2)] when the
initial iterate is chosen as

$\mathbf{t}^{(0)} = [1\ \ 1\ \ 1\ \ 1\ \ 1\ \ 1\ \ 1\ \ 1\ \ 1]^T$



4. The random walk illustrated in Figure Ex-4a can be described by six arrows
that specify the directions of departure from the successive mesh points along the path. Figure Ex-4b is an array of 100
computer-generated, randomly oriented arrows arranged in a 10 × 10 array. Use these arrows to determine random walks to
approximate the temperature $t_5$, as in Table 2. Proceed as follows:

1. Take the last two digits of your telephone number. Use the last digit to specify a row and the other to specify a column.

2. Go to the arrow in the array with that row and column number.

3. Using this arrow as a starting point, move through the array of arrows as you would read a book (left to right and top to
bottom). Beginning at the point labeled $t_5$ in Figure Ex-4a and using this sequence of arrows to specify a sequence of
directions, move from mesh point to mesh point until you reach a boundary mesh point. This completes your first random
walk. Record the temperature at the boundary mesh point. (If you reach the end of the arrow array, continue with the
arrow in the upper left corner.)

4. Return to the interior mesh point labeled $t_5$ and begin where you left off in the arrow array; generate your next random
walk. Repeat this process until you have completed 10 random walks and have recorded 10 boundary temperatures.

Calculate the average of the 10 boundary temperatures recorded. (The exact value is $t_5 = .7491$.)

Figure Ex-4



Section 11.12

Technology Exercises



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 



T1. Suppose that we have the square region described by

$R = \{(x, y) \mid 0 \le x \le 1,\ 0 \le y \le 1\}$

and suppose that the equilibrium temperature distribution $u(x, y)$ along the boundary is given by $u(x, 0) = T_B$, $u(x, 1) = T_T$,
$u(0, y) = T_L$, and $u(1, y) = T_R$. Suppose next that this region is partitioned into an $(n + 1) \times (n + 1)$ mesh using

$x_i = \frac{i}{n}$ and $y_j = \frac{j}{n}$

for $i = 0, 1, 2, \ldots, n$ and $j = 0, 1, 2, \ldots, n$. If the temperatures of the interior mesh points are labeled by

$u_{i,j} = u(x_i, y_j) = u(i/n, j/n)$

then show that

$u_{i,j} = \frac{1}{4}(u_{i-1,j} + u_{i+1,j} + u_{i,j-1} + u_{i,j+1})$

for $i = 1, 2, 3, \ldots, n - 1$ and $j = 1, 2, 3, \ldots, n - 1$. To handle the boundary points, define

$u_{0,j} = T_L$, $u_{n,j} = T_R$, $u_{i,0} = T_B$, and $u_{i,n} = T_T$

for $i = 1, 2, 3, \ldots, n - 1$ and $j = 1, 2, 3, \ldots, n - 1$. Next let

$F_{n+1} = \begin{bmatrix} \mathbf{0} & I_n \\ 1 & \mathbf{0} \end{bmatrix}$

be the $(n + 1) \times (n + 1)$ matrix with the $n \times n$ identity matrix in the upper right-hand corner, a one in the lower left-hand
corner, and zeros everywhere else. For example,

$F_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, $F_3 = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}$, $F_4 = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}$, $F_5 = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \end{bmatrix}$

and so on. By defining the $(n + 1) \times (n + 1)$ matrix

$M_{n+1} = F_{n+1} + F_{n+1}^T = \begin{bmatrix} \mathbf{0} & I_n \\ 1 & \mathbf{0} \end{bmatrix} + \begin{bmatrix} \mathbf{0} & I_n \\ 1 & \mathbf{0} \end{bmatrix}^T$

show that if $U_{n+1}$ is the $(n + 1) \times (n + 1)$ matrix with entries $u_{i,j}$, then the set of equations

$u_{i,j} = \frac{1}{4}(u_{i-1,j} + u_{i+1,j} + u_{i,j-1} + u_{i,j+1})$

for $i = 1, 2, 3, \ldots, n - 1$ and $j = 1, 2, 3, \ldots, n - 1$ can be written as the matrix equation

$U_{n+1} = \frac{1}{4}(M_{n+1}U_{n+1} + U_{n+1}M_{n+1})$

where we consider only those elements of $U_{n+1}$ with $i = 1, 2, 3, \ldots, n - 1$ and $j = 1, 2, 3, \ldots, n - 1$.

T2. The results of the preceding exercise and the discussion in the text suggest the following algorithm for solving for the
equilibrium temperature in the square region

$R = \{(x, y) \mid 0 \le x \le 1,\ 0 \le y \le 1\}$

given the boundary conditions

$u(x, 0) = T_B$, $u(x, 1) = T_T$, $u(0, y) = T_L$, $u(1, y) = T_R$

1. Choose a value for n, and then choose an initial guess, say

$U_{n+1}^{(0)} = \begin{bmatrix} 0 & T_L & \cdots & T_L & 0 \\ T_B & 0 & \cdots & 0 & T_T \\ \vdots & \vdots & & \vdots & \vdots \\ T_B & 0 & \cdots & 0 & T_T \\ 0 & T_R & \cdots & T_R & 0 \end{bmatrix}$

2. For each value of $k = 0, 1, 2, 3, \ldots$, compute $U_{n+1}^{(k+1)}$ using

$U_{n+1}^{(k+1)} = \frac{1}{4}(M_{n+1}U_{n+1}^{(k)} + U_{n+1}^{(k)}M_{n+1})$

where $M_{n+1}$ is as defined in Exercise T1. Then adjust $U_{n+1}^{(k+1)}$ by replacing all edge entries by the
initial edge entries in $U_{n+1}^{(0)}$.

Note The edge entries of a matrix are the entries in the first and last columns and first and last rows.

3. Continue this process until $U_{n+1}^{(k+1)} - U_{n+1}^{(k)}$ is approximately the zero matrix. This suggests that

$U_{n+1} = \lim_{k \to \infty} U_{n+1}^{(k)}$

Use a computer and this algorithm to solve for $u(x, y)$ given that

$u(x, 0) = 0$, $u(x, 1) = 0$, $u(0, y) = 0$, $u(1, y) = 2$

Choose $n = 6$ and compute the iterates $U_7^{(k)}$ until successive iterates agree to the accuracy you want. The exact solution can be expressed as

$u(x, y) = \frac{8}{\pi} \sum_{m=1}^{\infty} \frac{\sinh[(2m - 1)\pi x]\,\sin[(2m - 1)\pi y]}{(2m - 1)\sinh[(2m - 1)\pi]}$

Use a computer to compute $u(i/6, j/6)$ for $i, j = 0, 1, 2, 3, 4, 5, 6$, and then compare your results to the values of
the corresponding entries of your final iterate.

T3. Using the exact solution $u(x, y)$ for the temperature distribution described in Exercise T2, use a graphing program to do the
following:

(a) Plot the surface $z = u(x, y)$ in three-dimensional xyz-space in which z is the temperature at the point $(x, y)$ in the
square region.

(b) Plot several isotherms of the temperature distribution (curves in the xy-plane over which the temperature is a
constant).

(c) Plot several curves of the temperature as a function of x with y held constant.

(d) Plot several curves of the temperature as a function of y with x held constant.

11.13

COMPUTED 
TOMOGRAPHY 



In this section we shall see how constructing a cross-sectional view of a human 
body by analyzing X-ray scans leads to an inconsistent linear system. We 
present an iteration technique that provides an "approximate solution" of the 
linear system. 



Prerequisites: Linear Systems 

Natural Logarithms 
Euclidean Space $R^n$

The basic problem of computed tomography is to construct an image of a cross section of the human body using data collected 
from many individual beams of X rays that are passed through the cross section. These data are processed by a computer, and the 
computed cross section is displayed on a video monitor. Figure 11.13.1 is a diagram of General Electric's CT system showing a
patient prepared to have a cross section of his head scanned by X-ray beams.

Figure 11.13.1

Such a system is also known as a CAT scanner, for Computer-Aided Tomography scanner. Figure 11.13.2 shows a typical cross
section of a human head produced by the system.




Figure 11.13.2 



The first commercial system of computed tomography for medical use was developed in 1971 by G. N. Hounsfield of EMI, Ltd.,
in England. In 1979, Hounsfield and A. M. Cormack were awarded the Nobel Prize for their pioneering work in the field. As we
will see in this section, the construction of a cross section, or tomograph, requires the solution of a large linear system of equations.
Certain algorithms, called algebraic reconstruction techniques (ARTs), can be used to solve these linear systems, whose solutions
yield the cross sections in digital form.



Scanning Modes 

Unlike conventional X-ray pictures that are formed by X rays that are projected perpendicular to the plane of the picture, 
tomographs are constructed from thousands of individual, hairline-thin X-ray beams that lie in the plane of the cross section. After 
they pass through the cross section, the intensities of the X-ray beams are measured by an X-ray detector, and these measurements 
are relayed to a computer where they are processed. Figures 11.13.3 and 11.13.4 illustrate two possible modes of scanning the
cross section: the parallel mode and the fan-beam mode. In the parallel mode a single X-ray source and X-ray detector pair are
translated across the field of view containing the cross section, and many measurements of the parallel beams are recorded. Then
the source and detector pair are rotated through a small angle, and another set of measurements is taken. This is repeated until the
desired number of beam measurements is completed. For example, in the original 1971 machine, 160 parallel measurements were
taken through 180 angles spaced 1° apart: a total of 160 × 180 = 28,800 beam measurements. Each such scan took approximately
$5\frac{1}{2}$ minutes.

Figure 11.13.3 Parallel mode.

Figure 11.13.4 Fan-beam mode.



In the fan-beam mode of scanning, a single X-ray tube generates a fan of collimated beams whose intensities are measured 
simultaneously by an array of detectors on the other side of the field of view. The X-ray tube and detector array are rotated through 
many angles, and a set of measurements is taken at each angle until the scan is completed. In the General Electric CT system, 
which uses the fan-beam mode, each scan takes 1 second. 



Derivation of Equations 



To see how the cross section is reconstructed from the many individual beam measurements, refer to Figure 11.13.5. Here the field
of view in which the cross section is situated has been divided into many square pixels (picture elements) numbered 1 through N as
indicated. It is our desire to determine the X-ray density of each pixel. In the EMI system, 6400 pixels were used, arranged in a
square 80 × 80 array. The G.E. CT system uses 262,144 pixels in a 512 × 512 array, each pixel being about 1 mm on a side. After
the densities of the pixels are determined by the method we will describe, they are reproduced on a video monitor, with each pixel
shaded a level of gray proportional to its X-ray density. Because different tissues within the human body have different X-ray
densities, the video display clearly distinguishes the various tissues and organs within the cross section.

Figure 11.13.5

Figure 11.13.6 shows a single pixel with an X-ray beam of roughly the same width as the pixel passing squarely through it. The 
photons constituting the X-ray beam are absorbed by the tissue within the pixel at a rate proportional to the X-ray density of the 
tissue. Quantitatively, the X-ray density of the jth pixel is denoted by $x_j$ and is defined by

$x_j = \ln\left(\frac{\text{number of photons entering the } j\text{th pixel}}{\text{number of photons leaving the } j\text{th pixel}}\right)$

where "ln" denotes the natural logarithmic function. Using the logarithm property $\ln(a/b) = -\ln(b/a)$, we also have

$x_j = -\ln\left(\text{fraction of photons that pass through the } j\text{th pixel without being absorbed}\right)$

If the X-ray beam passes through an entire row of pixels (Figure 11.13.7), then the number of photons leaving one pixel is equal to
the number of photons entering the next pixel in the row. If the pixels are numbered 1, 2, . . ., n, then the additive property of the
logarithmic function gives

$x_1 + x_2 + \cdots + x_n = \ln\left(\frac{\text{number of photons entering the first pixel}}{\text{number of photons leaving the } n\text{th pixel}}\right)$

$\phantom{x_1 + x_2 + \cdots + x_n} = -\ln\left(\text{fraction of photons that pass through the row of } n \text{ pixels without being absorbed}\right)$     (1)



Thus, to determine the total X-ray density of a row of pixels, we simply sum the individual pixel densities. 
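
A small numerical check may make the additive property concrete. The sketch below uses made-up photon counts for a beam crossing a row of three pixels (the counts and the function name `density` are assumptions for illustration only) and confirms that the individual pixel densities sum to the density of the whole row.

```python
import math

def density(entering, leaving):
    """X-ray density: natural log of (photons entering / photons leaving)."""
    return math.log(entering / leaving)

# Hypothetical photon counts: entering pixel 1, then leaving pixels 1, 2, and 3.
counts = [1000.0, 800.0, 600.0, 300.0]

# Density of each pixel, using the count leaving one pixel as the count entering the next.
pixel_densities = [density(counts[k], counts[k + 1]) for k in range(len(counts) - 1)]

# Density of the whole row, computed from the first and last counts only.
row_density = density(counts[0], counts[-1])

print(sum(pixel_densities), row_density)
```

The two printed values agree up to rounding because the intermediate counts cancel when the logarithms are added.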
Figure 11.13.6

Figure 11.13.7



Next, consider the X-ray beam in Figure 11.13.5. By the beam density of the ith beam of a scan, denoted by $b_i$, we mean

$b_i = \ln\left(\frac{\text{number of photons of the } i\text{th beam entering the detector without the cross section in the field of view}}{\text{number of photons of the } i\text{th beam entering the detector with the cross section in the field of view}}\right)$

$\phantom{b_i} = -\ln\left(\text{fraction of photons of the } i\text{th beam that pass through the cross section without being absorbed}\right)$     (2)

The numerator in the first expression for $b_i$ is obtained by performing a calibration scan without the cross section in the field of
view. The resulting detector measurements are stored within the computer's memory. Then a clinical scan is performed with the
cross section in the field of view, the $b_i$'s of all the beams constituting the scan are computed, and the values are stored for further
processing.

For each beam that passes squarely through a row of pixels, we must have

(fraction of photons of the beam that pass through the row of pixels without being absorbed)
     = (fraction of photons of the beam that pass through the cross section without being absorbed)

Thus, if the ith beam passes squarely through a row of n pixels, then it follows from Equations 1 and 2 that

$x_1 + x_2 + \cdots + x_n = b_i$

In this equation, $b_i$ is known from the clinical and calibration measurements, and $x_1, x_2, \ldots, x_n$ are unknown pixel densities that
must be determined.

More generally, if the ith beam passes squarely through a row (or column) of pixels with numbers $j_1, j_2, \ldots, j_i$, then we have

$x_{j_1} + x_{j_2} + \cdots + x_{j_i} = b_i$

If we set

$a_{ij} = \begin{cases} 1, & \text{if } j = j_1, j_2, \ldots, j_i \\ 0, & \text{otherwise} \end{cases}$

then we may write this equation as

$a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{iN}x_N = b_i$     (3)

We shall refer to Equation 3 as the ith beam equation.



Referring to Figure 11.13.5, however, we see that the beams of a scan do not necessarily pass through a row or column of pixels
squarely. Instead, a typical beam passes diagonally through each pixel in its path. There are many ways to take this into account. In
Figure 11.13.8 we outline three methods of defining the quantities $a_{ij}$ that appear in Equation 3, each of which reduces to our
previous definition when the beam passes squarely through a row or column of pixels. Reading down the figure, each method is
more exact than its predecessor, but with successively more computational difficulty.

Figure 11.13.8 (panels: Center-of-Pixel Method, Center Line Method, Area Method)



Using any one of the three methods to define the $a_{ij}$'s in the ith beam equation, we can write the set of M beam equations in a
complete scan as

$a_{11}x_1 + a_{12}x_2 + \cdots + a_{1N}x_N = b_1$
$a_{21}x_1 + a_{22}x_2 + \cdots + a_{2N}x_N = b_2$
$\qquad\vdots$
$a_{M1}x_1 + a_{M2}x_2 + \cdots + a_{MN}x_N = b_M$     (4)

In this way we have a linear system of M equations (the M beam equations) in N unknowns (the N pixel densities).

Depending on the number of beams and pixels used, we may have M > N, M = N, or M < N. We will consider only the case
M > N, the so-called overdetermined case, in which there are more beams in the scan than pixels in the field of view. Because of
inherent modeling and experimental errors in the problem, we should not expect our linear system to have an exact mathematical
solution for the pixel densities. In the next section we attempt to find an "approximate" solution to this linear system.

Algebraic Reconstruction Techniques 

There have been many mathematical algorithms devised to treat the overdetermined linear system 4. The one we will describe 
belongs to the class of so-called Algebraic Reconstruction Techniques (ARTs). This method, which can be traced to an iterative 
technique originally introduced by S. Kaczmarz in 1937, was the one used in the first commercial machine. To introduce this 
technique, consider the following system of three equations in two unknowns:

$L_1$:  $x_1 + x_2 = 2$
$L_2$:  $x_1 - 2x_2 = -2$     (5)
$L_3$:  $3x_1 - x_2 = 3$

The lines $L_1$, $L_2$, $L_3$ determined by these three equations are plotted in the $x_1x_2$-plane. As shown in Figure 11.13.9a, the three lines
do not have a common intersection, and so the three equations do not have an exact solution. However, the points $(x_1, x_2)$ on the
shaded triangle formed by the three lines are all situated "near" these three lines and can be thought of as constituting
"approximate" solutions to our system. The following iterative procedure describes a geometric construction for generating points
on the boundary of that triangular region (Figure 11.13.9b):



Algorithm 1 



Step 0. Choose an arbitrary starting point $\mathbf{x}_0$ in the $x_1x_2$-plane.

Step 1. Project $\mathbf{x}_0$ orthogonally onto the first line $L_1$ and call the projection $\mathbf{x}_1^{(1)}$. The superscript (1) indicates that this is the
first of several cycles through the steps.

Step 2. Project $\mathbf{x}_1^{(1)}$ orthogonally onto the second line $L_2$ and call the projection $\mathbf{x}_2^{(1)}$.

Step 3. Project $\mathbf{x}_2^{(1)}$ orthogonally onto the third line $L_3$ and call the projection $\mathbf{x}_3^{(1)}$.

Step 4. Take $\mathbf{x}_3^{(1)}$ as the new value of $\mathbf{x}_0$ and cycle through Steps 1 through 3 again. In the second cycle, label the projected
points $\mathbf{x}_1^{(2)}, \mathbf{x}_2^{(2)}, \mathbf{x}_3^{(2)}$; in the third cycle, label the projected points $\mathbf{x}_1^{(3)}, \mathbf{x}_2^{(3)}, \mathbf{x}_3^{(3)}$; and so forth.

This algorithm generates three sequences of points

$L_1$:  $\mathbf{x}_1^{(1)}, \mathbf{x}_1^{(2)}, \mathbf{x}_1^{(3)}, \ldots$
$L_2$:  $\mathbf{x}_2^{(1)}, \mathbf{x}_2^{(2)}, \mathbf{x}_2^{(3)}, \ldots$
$L_3$:  $\mathbf{x}_3^{(1)}, \mathbf{x}_3^{(2)}, \mathbf{x}_3^{(3)}, \ldots$

that lie on the three lines $L_1$, $L_2$, and $L_3$, respectively. It can be shown that as long as the three lines are not all parallel, then the
first sequence converges to a point $\mathbf{x}_1^*$ on $L_1$, the second sequence converges to a point $\mathbf{x}_2^*$ on $L_2$, and the third sequence converges
to a point $\mathbf{x}_3^*$ on $L_3$ (Figure 11.13.9c). These three limit points form what is called the limit cycle of the iterative process. It can be
shown that the limit cycle is independent of the starting point $\mathbf{x}_0$.

We next discuss the specific formulas needed to effect the orthogonal projections in Algorithm 1. First, because the equation of a
line in $x_1x_2$-space is

$a_1x_1 + a_2x_2 = b$

we can express it in vector form as

$\mathbf{a}^T\mathbf{x} = b$

where

$\mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}$ and $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$

The following theorem gives the necessary projection formula (Exercise 5).

THEOREM 11.13.1

Orthogonal Projection Formula

Let L be a line in $R^2$ with equation $\mathbf{a}^T\mathbf{x} = b$, and let $\mathbf{x}^*$ be any point in $R^2$ (Figure 11.13.10). Then the orthogonal projection,
$\mathbf{x}_p$, of $\mathbf{x}^*$ onto L is given by

$\mathbf{x}_p = \mathbf{x}^* + \frac{(b - \mathbf{a}^T\mathbf{x}^*)}{\mathbf{a}^T\mathbf{a}}\,\mathbf{a}$



Figure 11.13.10 
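
As a quick worked check of the projection formula $\mathbf{x}_p = \mathbf{x}^* + \frac{b - \mathbf{a}^T\mathbf{x}^*}{\mathbf{a}^T\mathbf{a}}\mathbf{a}$ (an illustration added here, using the first line of system 5 and the starting point that appears in Table 1), take $\mathbf{a} = (1, 1)^T$, $b = 2$, and $\mathbf{x}^* = (1, 3)^T$:

```latex
\mathbf{a}^T\mathbf{x}^* = (1)(1) + (1)(3) = 4, \qquad \mathbf{a}^T\mathbf{a} = 1^2 + 1^2 = 2

\mathbf{x}_p = \begin{bmatrix} 1 \\ 3 \end{bmatrix} + \frac{2 - 4}{2}\begin{bmatrix} 1 \\ 1 \end{bmatrix}
             = \begin{bmatrix} 0 \\ 2 \end{bmatrix}
```

The point $(0, 2)$ satisfies $x_1 + x_2 = 2$, and the displacement $\mathbf{x}_p - \mathbf{x}^* = (-1, -1)^T$ is a multiple of $\mathbf{a}$, as an orthogonal projection requires.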



EXAMPLE 1 Using Algorithm 1 



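We can use Algorithm 1 to find an approximate solution of the linear system given in 5 and illustrated in Figure 11.13.9. If we
write the equations of the three lines as

$L_1$:  $\mathbf{a}_1^T\mathbf{x} = b_1$
$L_2$:  $\mathbf{a}_2^T\mathbf{x} = b_2$
$L_3$:  $\mathbf{a}_3^T\mathbf{x} = b_3$

where

$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$, $\mathbf{a}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$, $\mathbf{a}_2 = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$, $\mathbf{a}_3 = \begin{bmatrix} 3 \\ -1 \end{bmatrix}$,

$b_1 = 2$, $b_2 = -2$, $b_3 = 3$

then, using Theorem 11.13.1, we can express the iteration scheme in Algorithm 1 as

$\mathbf{x}_k^{(p)} = \mathbf{x}_{k-1}^{(p)} + \frac{(b_k - \mathbf{a}_k^T\mathbf{x}_{k-1}^{(p)})}{\mathbf{a}_k^T\mathbf{a}_k}\,\mathbf{a}_k, \qquad k = 1, 2, 3$

where p = 1 for the first cycle of iterates, p = 2 for the second cycle of iterates, and so forth. After each cycle of iterates [that is,
after $\mathbf{x}_3^{(p)}$ is computed], the next cycle of iterates is begun with $\mathbf{x}_0^{(p+1)}$ set equal to $\mathbf{x}_3^{(p)}$.

Table 1 gives the numerical results of six cycles of iterations starting with the initial point $\mathbf{x}_0 = (1, 3)$.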

Table 1

Iterate        x_1        x_2
x_0            1.00000    3.00000
x_1^(1)         .00000    2.00000
x_2^(1)         .40000    1.20000
x_3^(1)        1.30000     .90000
x_1^(2)        1.20000     .80000
x_2^(2)         .88000    1.44000
x_3^(2)        1.42000    1.26000
x_1^(3)        1.08000     .92000
x_2^(3)         .83200    1.40800
x_3^(3)        1.41600    1.22400
x_1^(4)        1.09200     .90800
x_2^(4)         .83680    1.41840
x_3^(4)        1.40920    1.22760
x_1^(5)        1.09080     .90920
x_2^(5)         .83632    1.41816
x_3^(5)        1.40908    1.22724
x_1^(6)        1.09092     .90908
x_2^(6)         .83637    1.41818
x_3^(6)        1.40909    1.22728



Using certain techniques that are impractical for large linear systems, we can show the exact values of the points of the limit cycle 
in this example to be 



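$\mathbf{x}_1^* = \left(\frac{12}{11}, \frac{10}{11}\right) = (1.09090\ldots, .90909\ldots)$

$\mathbf{x}_2^* = \left(\frac{46}{55}, \frac{78}{55}\right) = (.83636\ldots, 1.41818\ldots)$

$\mathbf{x}_3^* = \left(\frac{31}{22}, \frac{27}{22}\right) = (1.40909\ldots, 1.22727\ldots)$

It can be seen that the sixth cycle of iterates provides an excellent approximation to the limit cycle. Any one of the three iterates
$\mathbf{x}_1^{(6)}, \mathbf{x}_2^{(6)}, \mathbf{x}_3^{(6)}$ can be used as an approximate solution of the linear system. (The large discrepancies in the values of $\mathbf{x}_1^{(6)}$, $\mathbf{x}_2^{(6)}$, and
$\mathbf{x}_3^{(6)}$ are due to the artificial nature of this illustrative example. In practical problems, these discrepancies would be much smaller.)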
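
The computations of Example 1 can be reproduced with a few lines of code. The following is a minimal sketch rather than a definitive implementation; the names `project`, `LINES`, and `art_cycle` are invented for this illustration, and each line of system 5 is stored as a pair (a, b) for the equation $\mathbf{a}^T\mathbf{x} = b$.

```python
def project(x, a, b):
    """Orthogonal projection of the point x onto the line a^T x = b (Theorem 11.13.1)."""
    excess = b - sum(ai * xi for ai, xi in zip(a, x))
    scale = excess / sum(ai * ai for ai in a)
    return tuple(xi + scale * ai for xi, ai in zip(x, a))

# The three lines of system 5, each stored as (a, b).
LINES = [((1.0, 1.0), 2.0), ((1.0, -2.0), -2.0), ((3.0, -1.0), 3.0)]

def art_cycle(x0, lines=LINES):
    """One cycle of Algorithm 1: project onto each line in turn and keep every iterate."""
    iterates = []
    x = x0
    for a, b in lines:
        x = project(x, a, b)
        iterates.append(x)
    return iterates

# Six cycles starting from the initial point (1, 3) used in Table 1.
x = (1.0, 3.0)
for p in range(1, 7):
    cycle = art_cycle(x)
    x = cycle[-1]  # the last iterate of a cycle starts the next cycle
    print(p, [tuple(round(c, 5) for c in point) for point in cycle])
```

Each printed row holds the three iterates of one cycle and can be compared line by line with Table 1.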

To generalize Algorithm 1 so that it applies to an overdetermined system of M equations in N unknowns,

$a_{11}x_1 + a_{12}x_2 + \cdots + a_{1N}x_N = b_1$
$a_{21}x_1 + a_{22}x_2 + \cdots + a_{2N}x_N = b_2$
$\qquad\vdots$     (6)
$a_{M1}x_1 + a_{M2}x_2 + \cdots + a_{MN}x_N = b_M$

we introduce column vectors $\mathbf{x}$ and $\mathbf{a}_i$ as follows:

$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix}$, $\mathbf{a}_i = \begin{bmatrix} a_{i1} \\ a_{i2} \\ \vdots \\ a_{iN} \end{bmatrix}$, $i = 1, 2, \ldots, M$

With these vectors, the M equations constituting our linear system 6 can be written in vector form as

$\mathbf{a}_i^T\mathbf{x} = b_i, \qquad i = 1, 2, \ldots, M$

Each of these M equations defines what is called a hyperplane in the N-dimensional Euclidean space $R^N$. In general these M
hyperplanes have no common intersection, and so we seek instead some point in $R^N$ that is reasonably "close" to all of them. Such
a point will constitute an approximate solution of the linear system, and its N entries will determine approximate pixel densities
with which to form the desired cross section.

As in the two-dimensional case, we will introduce an iterative process that generates cycles of successive orthogonal projections
onto the M hyperplanes beginning with some arbitrary initial point in $R^N$. Our notation for these successive iterates is

$\mathbf{x}_k^{(p)}$ = the iterate lying on the kth hyperplane generated during the pth cycle of iterations



The algorithm is as follows: 



Algorithm 2 



Step 0. Choose any point in $R^N$ and label it $\mathbf{x}_0^{(1)}$.

Step 1. For the first cycle of iterates, set p = 1.

Step 2. For k = 1, 2, . . . , M, compute

$\mathbf{x}_k^{(p)} = \mathbf{x}_{k-1}^{(p)} + \frac{(b_k - \mathbf{a}_k^T\mathbf{x}_{k-1}^{(p)})}{\mathbf{a}_k^T\mathbf{a}_k}\,\mathbf{a}_k$

Step 3. Set $\mathbf{x}_0^{(p+1)} = \mathbf{x}_M^{(p)}$.

Step 4. Increase the cycle number p by 1 and return to Step 2.

In Step 2 the iterate $\mathbf{x}_k^{(p)}$ is called the orthogonal projection of $\mathbf{x}_{k-1}^{(p)}$ onto the hyperplane $\mathbf{a}_k^T\mathbf{x} = b_k$. Consequently, as in the
two-dimensional case, this algorithm determines a sequence of orthogonal projections from one hyperplane onto the next in which
we cycle back to the first hyperplane after each projection onto the last hyperplane.

It can be shown that if the vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_M$ span $R^N$, then the iterates $\mathbf{x}_M^{(1)}, \mathbf{x}_M^{(2)}, \mathbf{x}_M^{(3)}, \ldots$ lying on the Mth hyperplane will
converge to a point $\mathbf{x}_M^*$ on that hyperplane which does not depend on the choice of the initial point $\mathbf{x}_0$. In computed tomography,
one of the iterates $\mathbf{x}_M^{(p)}$ for p sufficiently large is taken as an approximate solution of the linear system for the pixel densities.

Note that for the center-of-pixel method, the scalar quantity $\mathbf{a}_k^T\mathbf{a}_k$ appearing in the equation in Step 2 of the algorithm is simply the
number of pixels in which the kth beam passes through the center. Similarly, note that the scalar quantity

$b_k - \mathbf{a}_k^T\mathbf{x}_{k-1}^{(p)}$

in that same equation can be interpreted as the excess kth beam density that results if the pixel densities are set equal to the entries
of $\mathbf{x}_{k-1}^{(p)}$. This provides the following interpretation of our ART iteration scheme for the center-of-pixel method: Generate the pixel
densities of each iterate by distributing the excess beam density of successive beams in the scan evenly among those pixels in
which the beam passes through the center. When the last beam in the scan has been reached, return to the first beam and continue.



EXAMPLE 2 Using Algorithm 2 

We can use Algorithm 2 to find the unknown pixel densities of the 9 pixels arranged in the 3 × 3 array illustrated in Figure
11.13.11. These 9 pixels are scanned using the parallel mode with 12 beams whose measured beam densities are indicated in the
figure. We choose the center-of-pixel method to set up the 12 beam equations. (In Exercises 7 and 8, the reader is asked to set up
the beam equations using the center line and area methods.) As the reader can verify, the beam equations are

$x_7 + x_8 + x_9 = 13.00 \qquad x_3 + x_6 + x_9 = 18.00$
$x_4 + x_5 + x_6 = 15.00 \qquad x_2 + x_5 + x_8 = 12.00$
$x_1 + x_2 + x_3 = \phantom{1}8.00 \qquad x_1 + x_4 + x_7 = \phantom{1}6.00$
$x_6 + x_8 + x_9 = 14.79 \qquad x_2 + x_3 + x_6 = 10.51$
$x_3 + x_5 + x_7 = 14.31 \qquad x_1 + x_5 + x_9 = 16.13$
$x_1 + x_2 + x_4 = \phantom{1}3.81 \qquad x_4 + x_7 + x_8 = \phantom{1}7.04$

Table 2 illustrates the results of the iteration scheme starting with an initial iterate $\mathbf{x}_0 = \mathbf{0}$. The table gives the values of each of the
first cycle of iterates, $\mathbf{x}_1^{(1)}$ through $\mathbf{x}_{12}^{(1)}$, but thereafter gives the iterates $\mathbf{x}_{12}^{(p)}$ only for various values of p. The iterates $\mathbf{x}_{12}^{(p)}$ start
repeating to two decimal places for p > 45, and so we take the entries of $\mathbf{x}_{12}^{(45)}$ as approximate values of the 9 pixel densities.

Figure 11.13.11

Table 2





Pixel Densities

Iterate        x_1     x_2     x_3     x_4     x_5     x_6     x_7     x_8     x_9
x_0            .00     .00     .00     .00     .00     .00     .00     .00     .00
x_1^(1)        .00     .00     .00     .00     .00     .00    4.33    4.33    4.33
x_2^(1)        .00     .00     .00    5.00    5.00    5.00    4.33    4.33    4.33
x_3^(1)       2.67    2.67    2.67    5.00    5.00    5.00    4.33    4.33    4.33
x_4^(1)       2.67    2.67    2.67    5.00    5.00    5.37    4.33    4.71    4.71
x_5^(1)       2.67    2.67    3.44    5.00    5.77    5.37    5.10    4.71    4.71
x_6^(1)        .49     .49    3.44    2.83    5.77    5.37    5.10    4.71    4.71
x_7^(1)        .49     .49    4.93    2.83    5.77    6.87    5.10    4.71    6.20
x_8^(1)        .49     .84    4.93    2.83    6.11    6.87    5.10    5.05    6.20
x_9^(1)       -.31     .84    4.93    2.02    6.11    6.87    4.30    5.05    6.20
x_10^(1)      -.31     .13    4.22    2.02    6.11    6.16    4.30    5.05    6.20
x_11^(1)      1.06     .13    4.22    2.02    7.49    6.16    4.30    5.05    7.58
x_12^(1)      1.06     .13    4.22     .58    7.49    6.16    2.85    3.61    7.58
x_12^(2)      2.03     .69    4.42    1.34    7.49    5.39    2.65    3.04    6.61
x_12^(3)      1.78     .51    4.52    1.26    7.49    5.48    2.56    3.22    6.86
x_12^(4)      1.82     .52    4.62    1.37    7.49    5.37    2.45    3.22    6.82
x_12^(5)      1.79     .49    4.71    1.43    7.49    5.31    2.37    3.25    6.85
x_12^(10)     1.68     .44    5.03    1.70    7.49    5.03    2.04    3.29    6.96
x_12^(20)     1.49     .48    5.29    2.00    7.49    4.73    1.79    3.25    7.15
x_12^(30)     1.38     .55    5.34    2.11    7.49    4.62    1.74    3.19    7.26
x_12^(40)     1.33     .59    5.33    2.14    7.49    4.59    1.75    3.15    7.31
x_12^(45)     1.32     .60    5.32    2.15    7.49    4.59    1.76    3.14    7.32

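
Algorithm 2 with the center-of-pixel method can be sketched in code for the 3 × 3 array of Example 2. This is an illustrative sketch under stated assumptions, not the method used in any commercial system: the names `BEAMS`, `beam_row`, `project`, and `art` are invented here, the order in which the 12 beams are processed (left column of beam equations from top to bottom, then the right column) is inferred from the pattern of changes in Table 2, and the density 18.00 for the beam through pixels 3, 6, and 9 is the value consistent with the first-cycle entries of Table 2.

```python
def project(x, a, b):
    """Orthogonal projection of x onto the hyperplane a^T x = b (Step 2 of Algorithm 2)."""
    excess = b - sum(ai * xi for ai, xi in zip(a, x))  # excess beam density
    scale = excess / sum(ai * ai for ai in a)          # a^T a = pixels crossed through the center
    return [xi + scale * ai for xi, ai in zip(x, a)]

# Each beam: (pixel numbers whose centers it crosses, measured beam density).
# The ordering is an assumption inferred from Table 2.
BEAMS = [
    ((7, 8, 9), 13.00), ((4, 5, 6), 15.00), ((1, 2, 3), 8.00),
    ((6, 8, 9), 14.79), ((3, 5, 7), 14.31), ((1, 2, 4), 3.81),
    ((3, 6, 9), 18.00), ((2, 5, 8), 12.00), ((1, 4, 7), 6.00),
    ((2, 3, 6), 10.51), ((1, 5, 9), 16.13), ((4, 7, 8), 7.04),
]
N_PIXELS = 9

def beam_row(pixels):
    """Center-of-pixel coefficients: 1 for pixels the beam crosses, 0 otherwise."""
    return [1.0 if j in pixels else 0.0 for j in range(1, N_PIXELS + 1)]

ROWS = [(beam_row(pixels), b) for pixels, b in BEAMS]

def art(x0, cycles):
    """Run the given number of full cycles of projections and return the last iterate."""
    x = list(x0)
    for _ in range(cycles):
        for a, b in ROWS:
            x = project(x, a, b)
    return x

first_cycle = art([0.0] * N_PIXELS, cycles=1)
second_cycle = art([0.0] * N_PIXELS, cycles=2)
print([round(v, 2) for v in first_cycle])
print([round(v, 2) for v in second_cycle])
```

The two printed vectors can be compared with the rows of Table 2 for the last iterates of the first and second cycles; calling `art` with a larger number of cycles shows how slowly the remaining entries settle.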



We close this section by noting that the field of computed tomography is presently a very active research area. In fact, the ART 
scheme discussed here has been replaced in commercial systems by more sophisticated techniques that are faster and provide a 
more accurate view of the cross section. However, all the new techniques address the same basic mathematical problem: finding a 
good approximate solution of a large overdetermined inconsistent linear system of equations. 



Exercise Set 11.13



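1. (a) Setting $\mathbf{x}_k^{(p)} = (x_{k1}^{(p)}, x_{k2}^{(p)})$, show that the three projection equations

$\mathbf{x}_k^{(p)} = \mathbf{x}_{k-1}^{(p)} + \frac{(b_k - \mathbf{a}_k^T\mathbf{x}_{k-1}^{(p)})}{\mathbf{a}_k^T\mathbf{a}_k}\,\mathbf{a}_k, \qquad k = 1, 2, 3$

for the three lines in Equation 5 can be written as

$k = 1$:  $x_{11}^{(p)} = \frac{1}{2}[2 + x_{01}^{(p)} - x_{02}^{(p)}]$,  $x_{12}^{(p)} = \frac{1}{2}[2 - x_{01}^{(p)} + x_{02}^{(p)}]$

$k = 2$:  $x_{21}^{(p)} = \frac{1}{5}[-2 + 4x_{11}^{(p)} + 2x_{12}^{(p)}]$,  $x_{22}^{(p)} = \frac{1}{5}[4 + 2x_{11}^{(p)} + x_{12}^{(p)}]$

$k = 3$:  $x_{31}^{(p)} = \frac{1}{10}[9 + x_{21}^{(p)} + 3x_{22}^{(p)}]$,  $x_{32}^{(p)} = \frac{1}{10}[-3 + 3x_{21}^{(p)} + 9x_{22}^{(p)}]$

where $(x_{01}^{(p+1)}, x_{02}^{(p+1)}) = (x_{31}^{(p)}, x_{32}^{(p)})$ for $p = 1, 2, \ldots$.

(b) Show that the three pairs of equations in part (a) can be combined to produce

$x_{31}^{(p)} = \frac{1}{20}[28 + x_{31}^{(p-1)} - x_{32}^{(p-1)}]$
$x_{32}^{(p)} = \frac{1}{20}[24 + 3x_{31}^{(p-1)} - 3x_{32}^{(p-1)}]$     $p = 1, 2, \ldots$

where $(x_{31}^{(0)}, x_{32}^{(0)}) = (x_{01}^{(1)}, x_{02}^{(1)}) = \mathbf{x}_0^{(1)}$.

Note Using this pair of equations, we can perform one complete cycle of three orthogonal projections in a single step.

(c) Because $\mathbf{x}_3^{(p)}$ tends to the limit point $\mathbf{x}_3^*$ as $p \to \infty$, the equations in part (b) become

$x_{31}^* = \frac{1}{20}[28 + x_{31}^* - x_{32}^*]$
$x_{32}^* = \frac{1}{20}[24 + 3x_{31}^* - 3x_{32}^*]$

as $p \to \infty$. Solve this linear system for $\mathbf{x}_3^* = (x_{31}^*, x_{32}^*)$.

Note The simplifications of the ART formulas described in this exercise are impractical for the large linear systems that
arise in realistic computed tomography problems.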

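2. Use the result of Exercise 1(b) to find $\mathbf{x}_3^{(1)}, \mathbf{x}_3^{(2)}, \ldots, \mathbf{x}_3^{(6)}$ to five decimal places in Example 1 using the following initial points:

(a) $\mathbf{x}_0 = (0, 0)$

(b) $\mathbf{x}_0 = (1, 1)$

(c) $\mathbf{x}_0 = (148, -15)$

3. (a) Show directly that the points of the limit cycle in Example 1,

$\mathbf{x}_1^* = \left(\frac{12}{11}, \frac{10}{11}\right)$, $\mathbf{x}_2^* = \left(\frac{46}{55}, \frac{78}{55}\right)$, $\mathbf{x}_3^* = \left(\frac{31}{22}, \frac{27}{22}\right)$

form a triangle whose vertices lie on the lines $L_1$, $L_2$, and $L_3$ and whose sides are perpendicular
to these lines (Figure 11.13.9c).

(b) Using the equations derived in Exercise 1(a), show that if $\mathbf{x}_0^{(1)} = \mathbf{x}_3^* = \left(\frac{31}{22}, \frac{27}{22}\right)$ then

$\mathbf{x}_1^{(1)} = \mathbf{x}_1^* = \left(\frac{12}{11}, \frac{10}{11}\right)$
$\mathbf{x}_2^{(1)} = \mathbf{x}_2^* = \left(\frac{46}{55}, \frac{78}{55}\right)$
$\mathbf{x}_3^{(1)} = \mathbf{x}_3^* = \left(\frac{31}{22}, \frac{27}{22}\right)$

Note Either part of this exercise shows that successive orthogonal projections of any point on
the limit cycle will move around the limit cycle indefinitely.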

4. The following three lines in the $x_1x_2$-plane,

$L_1$:  $x_2 = 1$
$L_2$:  $x_1 - x_2 = 2$
$L_3$:  $x_1 - x_2 = 0$

do not have a common intersection. Draw an accurate sketch of the three lines and graphically perform several cycles of the
orthogonal projections described in Algorithm 1, beginning with the initial point $\mathbf{x}_0 = (0, 0)$. On the basis of your sketch,
determine the three points of the limit cycle.

5. Prove Theorem 11.13.1 by verifying that

(a) the point $\mathbf{x}_p$ as defined in the theorem lies on the line $\mathbf{a}^T\mathbf{x} = b$ (i.e., $\mathbf{a}^T\mathbf{x}_p = b$);

(b) the vector $\mathbf{x}_p - \mathbf{x}^*$ is orthogonal to the line $\mathbf{a}^T\mathbf{x} = b$ (that is, $\mathbf{x}_p - \mathbf{x}^*$ is parallel to $\mathbf{a}$).

6. As stated in the text, the iterates $\mathbf{x}_M^{(1)}, \mathbf{x}_M^{(2)}, \mathbf{x}_M^{(3)}, \ldots$ defined in Algorithm 2 will converge to a unique limit point $\mathbf{x}_M^*$ if the
vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_M$ span $R^N$. Show that if this is the case and if the center-of-pixel method is used, then the center of each of
the N pixels in the field of view is crossed by at least one of the M beams in the scan.

7. Construct the 12 beam equations in Example 2 using the center line method. Assume that the distance between the center lines
of adjacent beams is equal to the width of a single pixel.

8. Construct the 12 beam equations in Example 2 using the area method. Assume that the width of each beam is equal to the width
of a single pixel and that the distance between the center lines of adjacent beams is also equal to the width of a single pixel.



Section 11.13

Technology Exercises

The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 

T1. Given the set of equations

$$a_k x + b_k y = c_k$$

for $k = 1, 2, 3, \ldots, n$ (with $n > 2$), let us consider the following algorithm for obtaining an approximate solution to the system.

1. Solve all possible pairs of equations

$$a_i x + b_i y = c_i \quad \text{and} \quad a_j x + b_j y = c_j$$

for $i, j = 1, 2, 3, \ldots, n$ and $i < j$ for their unique solutions. This leads to

$$\tfrac{1}{2}n(n-1)$$

solutions, which we label as

$$(x_{ij}, y_{ij})$$

for $i, j = 1, 2, 3, \ldots, n$ and $i < j$.

2. Construct the geometric center of these points defined by

$$(x_c, y_c) = \left( \frac{2}{n(n-1)} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} x_{ij},\; \frac{2}{n(n-1)} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} y_{ij} \right)$$

and use this as the approximate solution to the original system.

Use this algorithm to approximate the solution to the system

$$\begin{aligned} x + y &= 2 \\ x - 2y &= -2 \\ 3x - y &= 3 \end{aligned}$$

and compare your results to those in this section.

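T2. (For Readers Who Have Studied Calculus)

Given the set of equations

$$a_k x + b_k y = c_k$$

for $k = 1, 2, 3, \ldots, n$ (with $n > 2$), let us consider the following least squares algorithm for obtaining an approximate solution $(x^*, y^*)$ to the system. Given a point $(x_0, y_0)$ and the line $a_i x + b_i y = c_i$, the distance from this point to the line is given by

$$\frac{|a_i x_0 + b_i y_0 - c_i|}{\sqrt{a_i^2 + b_i^2}}$$

If we define a function $f(x, y)$ by

$$f(x, y) = \sum_{i=1}^{n} \frac{(a_i x + b_i y - c_i)^2}{a_i^2 + b_i^2}$$

and then determine the point $(x^*, y^*)$ that minimizes this function, we will determine the point that is closest to each of these lines in a summed least squares sense. Show that $x^*$ and $y^*$ are solutions to the system

$$\left( \sum_{i=1}^{n} \frac{a_i^2}{a_i^2 + b_i^2} \right) x^* + \left( \sum_{i=1}^{n} \frac{a_i b_i}{a_i^2 + b_i^2} \right) y^* = \sum_{i=1}^{n} \frac{a_i c_i}{a_i^2 + b_i^2}$$

and

$$\left( \sum_{i=1}^{n} \frac{a_i b_i}{a_i^2 + b_i^2} \right) x^* + \left( \sum_{i=1}^{n} \frac{b_i^2}{a_i^2 + b_i^2} \right) y^* = \sum_{i=1}^{n} \frac{b_i c_i}{a_i^2 + b_i^2}$$

Apply this algorithm to the system

$$\begin{aligned} x + y &= 2 \\ x - 2y &= -2 \\ 3x - y &= 3 \end{aligned}$$

and compare your results to those in this section.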
11.14

FRACTALS 



In this section we shall use certain classes of linear transformations to describe 
and generate intricate sets in the Euclidean plane. These sets, called fractals, 
are currently the focus of much mathematical and scientific research. 



Prerequisites:

Geometry of Linear Operators on $R^2$ (Section 9.2)

Euclidean Space $R^n$

Natural Logarithms

Intuitive Understanding of Limits



Fractals in the Euclidean Plane 

At the end of the nineteenth century and the beginning of the twentieth century, various bizarre and wild sets of points in the Euclidean plane began appearing in mathematics. Although they were initially mathematical curiosities, these sets, called fractals, are rapidly growing in importance. It is now recognized that they reveal a regularity in physical and biological phenomena previously dismissed as "random," "noisy," or "chaotic." For example, fractals are all around us in the shapes of clouds, mountains, coastlines, trees, and ferns.

In this section we give a brief description of certain types of fractals in the Euclidean plane $R^2$. Much of this description is an outgrowth of the work of two mathematicians, Benoit B. Mandelbrot and Michael Barnsley, who are both active researchers in the field.



Self-Similar Sets 

To begin our study of fractals, we need to introduce some terminology about sets in $R^2$. We shall call a set in $R^2$ bounded if it can be enclosed by a suitably large circle (Figure 11.14.1) and closed if it contains all of its boundary points (Figure 11.14.2). Two sets in $R^2$ will be called congruent if they can be made to coincide exactly by translating and rotating them appropriately within $R^2$ (Figure 11.14.3). We will also rely on the reader's intuitive concept of overlapping and nonoverlapping sets, as illustrated in Figure 11.14.4.



Figure 11.14.1 (a) Set enclosed by a circle. (b) Unbounded set: this set cannot be enclosed by any circle.

Figure 11.14.2 The boundary points (solid color) lie in the set.

Figure 11.14.3

Figure 11.14.4 (a) Overlapping sets. (b) Nonoverlapping sets.

If $T: R^2 \to R^2$ is the linear operator that scales by a factor of $s$ (see Table 8 of Section 4.2), and if $Q$ is a set in $R^2$, then the set $T(Q)$ (the set of images of points in $Q$ under $T$) is called a dilation of the set $Q$ if $s > 1$ and a contraction of $Q$ if $0 < s < 1$ (Figure 11.14.5). In either case we say that $T(Q)$ is the set $Q$ scaled by the factor $s$.



Figure 11.14.5 A contraction of $Q$.



The types of fractals we shall consider first are called self-similar. In general, we define a self-similar set in $R^2$ as follows:

DEFINITION

A closed and bounded subset of the Euclidean plane $R^2$ is said to be self-similar if it can be expressed in the form

$$S = S_1 \cup S_2 \cup S_3 \cup \cdots \cup S_k \tag{1}$$

where $S_1, S_2, S_3, \ldots, S_k$ are nonoverlapping sets, each of which is congruent to $S$ scaled by the same factor $s$ ($0 < s < 1$).

If $S$ is a self-similar set, then Equation 1 is sometimes called a decomposition of $S$ into nonoverlapping congruent sets.



EXAMPLE 1 Line Segment

A line segment in $R^2$ (Figure 11.14.6a) can be expressed as the union of two nonoverlapping congruent line segments (Figure 11.14.6b). In Figure 11.14.6b we have separated the two line segments slightly so that they can be seen more easily. Each of these two smaller line segments is congruent to the original line segment scaled by a factor of $\frac{1}{2}$. Hence, a line segment is a self-similar set with $k = 2$ and $s = \frac{1}{2}$.

Figure 11.14.6



EXAMPLE 2 Square

A square (Figure 11.14.7a) can be expressed as the union of four nonoverlapping congruent squares (Figure 11.14.7b), where we have again separated the smaller squares slightly. Each of the four smaller squares is congruent to the original square scaled by a factor of $\frac{1}{2}$. Hence, a square is a self-similar set with $k = 4$ and $s = \frac{1}{2}$.

Figure 11.14.7




EXAMPLE 3 Sierpinski Carpet

The set suggested by Figure 11.14.8a, the Sierpinski "carpet," was first described by the Polish mathematician Waclaw Sierpinski (1882-1969). It can be expressed as the union of eight nonoverlapping congruent subsets (Figure 11.14.8b), each of which is congruent to the original set scaled by a factor of $\frac{1}{3}$. Hence, it is a self-similar set with $k = 8$ and $s = \frac{1}{3}$. Note that the intricate square-within-a-square pattern continues forever on a smaller and smaller scale (although this can only be suggested in a figure such as the one shown).

Figure 11.14.8



EXAMPLE 4 Sierpinski Triangle

Figure 11.14.9a illustrates another set described by Sierpinski. It is a self-similar set with $k = 3$ and $s = \frac{1}{2}$ (Figure 11.14.9b). As with the Sierpinski carpet, the intricate triangle-within-a-triangle pattern continues forever on a smaller and smaller scale.

Figure 11.14.9



The Sierpinski carpet and triangle have a more intricate structure than the line segment and the square in that they exhibit a pattern 
that is repeated indefinitely. This difference will be explored later in this section. 

Topological Dimension of a Set 

In Section 5.4 we defined the dimension of a subspace of a vector space to be the number of vectors in a basis, and we found that definition to coincide with our intuitive sense of dimension. For example, the origin of $R^2$ is zero-dimensional, lines through the origin are one-dimensional, and $R^2$ itself is two-dimensional. This definition of dimension is a special case of a more general concept called topological dimension, which is applicable to sets in $R^n$ that are not necessarily subspaces. A precise definition of this concept is studied in a branch of mathematics called topology. Although that definition is beyond the scope of this text, we can state informally that

• a point in $R^2$ has topological dimension zero;

• a curve in $R^2$ has topological dimension one;

• a region in $R^2$ has topological dimension two.

It can be proved that the topological dimension of a set in $R^n$ must be an integer between 0 and $n$, inclusive. In this text we shall denote the topological dimension of a set $S$ by $d_T(S)$.



EXAMPLE 5 Topological Dimensions of Sets

Table 1 gives the topological dimensions of the sets studied in our earlier examples. The first two results in this table are intuitively obvious; however, the last two are not. Informally stated, the Sierpinski carpet and triangle both contain so many "holes" that those sets resemble web-like networks of lines rather than regions. Hence they have topological dimension one. The proofs are quite difficult.

Table 1

Set S                  d_T(S)
Line segment           1
Square                 2
Sierpinski carpet      1
Sierpinski triangle    1



Hausdorff Dimension of a Self-Similar Set 

In 1919 the German mathematician Felix Hausdorff (1868-1942) gave an alternative definition for the dimension of an arbitrary set in $R^n$. His definition is quite complicated, but for a self-similar set, it reduces to something rather simple:

DEFINITION

The Hausdorff dimension of a self-similar set $S$ of form 1 is denoted by $d_H(S)$ and is defined by

$$d_H(S) = \frac{\ln k}{\ln(1/s)} \tag{2}$$

In this definition, "ln" denotes the natural logarithm function. Equation 2 can also be expressed as

$$s^{-d_H(S)} = k \tag{3}$$

(equivalently, $s^{d_H(S)} = 1/k$), in which the Hausdorff dimension $d_H(S)$ appears as an exponent. Formula 3 is more helpful for interpreting the concept of Hausdorff dimension; it states, for example, that if you scale a self-similar set by a factor of $s = \frac{1}{2}$, then its area (or, more properly, its measure) decreases by a factor of $\left(\frac{1}{2}\right)^{d_H(S)}$. Thus, scaling a line segment by a factor of $\frac{1}{2}$ reduces its measure (length) by a factor of $\left(\frac{1}{2}\right)^1 = \frac{1}{2}$, and scaling a square region by a factor of $\frac{1}{2}$ reduces its measure (area) by a factor of $\left(\frac{1}{2}\right)^2 = \frac{1}{4}$.

Before proceeding to some examples, we should note a few facts about the Hausdorff dimension of a set:

• The topological dimension and Hausdorff dimension of a set need not be the same.

• The Hausdorff dimension of a set need not be an integer.

• The topological dimension of a set is less than or equal to its Hausdorff dimension; that is, $d_T(S) \le d_H(S)$.



EXAMPLE 6 Hausdorff Dimensions of Sets

Table 2 lists the Hausdorff dimensions of the sets studied in our earlier examples.

Table 2

Set S                  s      k    d_H(S) = ln k / ln(1/s)
Line segment           1/2    2    ln 2 / ln 2 = 1
Square                 1/2    4    ln 4 / ln 2 = 2
Sierpinski carpet      1/3    8    ln 8 / ln 3 = 1.892...
Sierpinski triangle    1/2    3    ln 3 / ln 2 = 1.584...
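The entries of Table 2 can be checked numerically. The following Python sketch evaluates Equation 2 for the four sets; the function name `hausdorff_dimension` is our own choice for illustration and does not appear in the text.

```python
import math


def hausdorff_dimension(k: int, s: float) -> float:
    """Equation 2: d_H(S) = ln k / ln(1/s) for k nonoverlapping pieces scaled by s."""
    if k < 1 or not (0.0 < s < 1.0):
        raise ValueError("need k >= 1 and 0 < s < 1")
    return math.log(k) / math.log(1.0 / s)


# (k, s) pairs taken from Table 2
examples = {
    "Line segment": (2, 1 / 2),
    "Square": (4, 1 / 2),
    "Sierpinski carpet": (8, 1 / 3),
    "Sierpinski triangle": (3, 1 / 2),
}

for name, (k, s) in examples.items():
    print(f"{name:20s} d_H = {hausdorff_dimension(k, s):.4f}")
```

The same function also confirms Formula 3: for each pair, `s ** (-hausdorff_dimension(k, s))` returns `k` up to rounding error.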



Fractals 

Comparing Tables 1 and 2, we see that the Hausdorff and topological dimensions are equal for both the line segment and the square but are unequal for the Sierpinski carpet and triangle. In 1977 Benoit B. Mandelbrot suggested that sets for which the topological and Hausdorff dimensions differ must be quite complicated (as Hausdorff had earlier suggested in 1919). Mandelbrot proposed calling such sets fractals, and he offered the following definition.



DEFINITION 



A fractal is a subset of a Euclidean space whose Hausdorff dimension and topological dimension are not equal. 



Mandelbrot has suggested that this definition is rather limited and probably will be replaced in the future. But in the meantime, it 
remains the formal definition of a fractal. According to this definition, the Sierpinski carpet and Sierpinski triangle are fractals, 
whereas the line segment and square are not. 

It follows from the preceding definition that a set whose Hausdorff dimension is not an integer must be a fractal (why?). However, 
we will see later that the converse is not true; that is, it is possible for a fractal to have an integer Hausdorff dimension. 

Similitudes 

We shall now show how some techniques from linear algebra can be used to generate fractals. This linear algebra approach also 
leads to algorithms that can be exploited to draw fractals on a computer. We begin with a definition. 



DEFINITION

A similitude with scale factor $s$ is a mapping of $R^2$ into $R^2$ of the form

$$T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = s \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} e \\ f \end{bmatrix}$$

where $s$, $\theta$, $e$, and $f$ are scalars.

Geometrically, a similitude is a composition of three simpler mappings: a scaling by a factor of $s$, a rotation about the origin through an angle $\theta$, and a translation ($e$ units in the $x$-direction and $f$ units in the $y$-direction). Figure 11.14.10 illustrates the effect of a similitude on the unit square $U$.

Figure 11.14.10

For our application to fractals, we shall need only similitudes that are contractions, by which we mean that the scale factor $s$ is restricted to the range $0 < s < 1$. Consequently, when we refer to similitudes we shall always mean similitudes subject to this restriction.
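To make the definition concrete, the following Python sketch builds a similitude from its four scalars and applies it to the corners of the unit square. The helper name `make_similitude` and the sample values $s = 0.5$, $\theta = 90°$, $(e, f) = (1, 0)$ are illustrative assumptions, not values taken from the text.

```python
import math


def make_similitude(s, theta, e, f):
    """Return T(x, y) = s * R(theta) * (x, y) + (e, f), where R is a rotation matrix."""
    c, sn = math.cos(theta), math.sin(theta)

    def T(x, y):
        # Scale and rotate about the origin, then translate by (e, f).
        return (s * (c * x - sn * y) + e, s * (sn * x + c * y) + f)

    return T


# Hypothetical example: contract by 1/2, rotate by 90 degrees, shift 1 unit in x.
T = make_similitude(0.5, math.pi / 2, 1.0, 0.0)
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
image = [T(x, y) for (x, y) in unit_square]
for p, q in zip(unit_square, image):
    print(p, "->", tuple(round(v, 6) for v in q))
```

Because rotation and translation preserve distances, every distance between image points is exactly $s$ times the corresponding original distance, which is why the image of a set is congruent to the set scaled by $s$.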

Similitudes are important in the study of fractals because of the following fact: 



If $T: R^2 \to R^2$ is a similitude with scale factor $s$ and if $S$ is a closed and bounded set in $R^2$, then the image $T(S)$ of the set $S$ under $T$ is congruent to $S$ scaled by $s$.

Recall from the definition of a self-similar set in $R^2$ that a closed and bounded set $S$ in $R^2$ is self-similar if it can be expressed in the form

$$S = S_1 \cup S_2 \cup S_3 \cup \cdots \cup S_k$$

where $S_1, S_2, S_3, \ldots, S_k$ are nonoverlapping sets each of which is congruent to $S$ scaled by the same factor $s$ ($0 < s < 1$) [see Equation 1]. In the following examples, we will find similitudes that produce the sets $S_1, S_2, S_3, \ldots, S_k$ from $S$ for the line segment, square, Sierpinski carpet, and Sierpinski triangle.



EXAMPLE 7 Line Segment

We shall take as our line segment the line segment $S$ connecting the points (0, 0) and (1, 0) in the $xy$-plane (Figure 11.14.11a). Consider the two similitudes

$$T_1\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \frac{1}{2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}, \qquad T_2\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \frac{1}{2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} \frac{1}{2} \\ 0 \end{bmatrix}$$

both of which have $s = \frac{1}{2}$ and $\theta = 0$. In Figure 11.14.11b we show how these two similitudes map the unit square $U$. The similitude $T_1$ maps $U$ onto the smaller square $T_1(U)$, and the similitude $T_2$ maps $U$ onto the smaller square $T_2(U)$. At the same time, $T_1$ maps the line segment $S$ onto the smaller line segment $T_1(S)$, and $T_2$ maps $S$ onto the smaller nonoverlapping line segment $T_2(S)$. The union of these two smaller nonoverlapping line segments is precisely the original line segment $S$; that is,

$$S = T_1(S) \cup T_2(S)$$

Figure 11.14.11



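EXAMPLE 8 Square

Let us consider the unit square $U$ in the $xy$-plane (Figure 11.14.12a) and the following four similitudes, all having $s = \frac{1}{2}$ and $\theta = 0$:

$$T_i\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \frac{1}{2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} e_i \\ f_i \end{bmatrix}, \quad i = 1, 2, 3, 4$$

where the four translations $(e_i, f_i)$ are $(0, 0)$, $(\frac{1}{2}, 0)$, $(0, \frac{1}{2})$, and $(\frac{1}{2}, \frac{1}{2})$.

The images of the unit square $U$ under these four similitudes are the four squares shown in Figure 11.14.12b. Thus,

$$U = T_1(U) \cup T_2(U) \cup T_3(U) \cup T_4(U) \tag{7}$$

is a decomposition of $U$ into four nonoverlapping squares that are congruent to $U$ scaled by the same scale factor $s = \frac{1}{2}$.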



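EXAMPLE 9 Sierpinski Carpet

Let us consider a Sierpinski carpet $S$ over the unit square $U$ of the $xy$-plane (Figure 11.14.13a) and the following eight similitudes, all having $s = \frac{1}{3}$ and $\theta = 0$:

$$T_i\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \frac{1}{3} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} e_i \\ f_i \end{bmatrix}, \quad i = 1, 2, \ldots, 8 \tag{8}$$

where the eight values of $\begin{bmatrix} e_i \\ f_i \end{bmatrix}$ are

$$\begin{bmatrix} 0 \\ 0 \end{bmatrix},\ \begin{bmatrix} \frac{1}{3} \\ 0 \end{bmatrix},\ \begin{bmatrix} \frac{2}{3} \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ \frac{1}{3} \end{bmatrix},\ \begin{bmatrix} \frac{2}{3} \\ \frac{1}{3} \end{bmatrix},\ \begin{bmatrix} 0 \\ \frac{2}{3} \end{bmatrix},\ \begin{bmatrix} \frac{1}{3} \\ \frac{2}{3} \end{bmatrix},\ \begin{bmatrix} \frac{2}{3} \\ \frac{2}{3} \end{bmatrix}$$

The images of $S$ under these eight similitudes are the eight sets shown in Figure 11.14.13b. Thus,

$$S = T_1(S) \cup T_2(S) \cup T_3(S) \cup \cdots \cup T_8(S) \tag{9}$$

is a decomposition of $S$ into eight nonoverlapping sets that are congruent to $S$ scaled by the same scale factor $s = \frac{1}{3}$.

Figure 11.14.13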



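EXAMPLE 10 Sierpinski Triangle

Let us consider a Sierpinski triangle $S$ fitted inside the unit square $U$ of the $xy$-plane, as shown in Figure 11.14.14a, and the following three similitudes, all having $s = \frac{1}{2}$ and $\theta = 0$:

$$\begin{aligned}
T_1\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) &= \frac{1}{2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \\
T_2\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) &= \frac{1}{2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} \frac{1}{2} \\ 0 \end{bmatrix} \\
T_3\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) &= \frac{1}{2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 0 \\ \frac{1}{2} \end{bmatrix}
\end{aligned} \tag{10}$$

The images of $S$ under these three similitudes are the three sets in Figure 11.14.14b. Thus,

$$S = T_1(S) \cup T_2(S) \cup T_3(S) \tag{11}$$

is a decomposition of $S$ into three nonoverlapping sets that are congruent to $S$ scaled by the same scale factor $s = \frac{1}{2}$.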
In the preceding examples we started with a specific set $S$ and showed that it was self-similar by finding similitudes $T_1, T_2, T_3, \ldots, T_k$ with the same scale factor such that $T_1(S), T_2(S), T_3(S), \ldots, T_k(S)$ were nonoverlapping sets and such that

$$S = T_1(S) \cup T_2(S) \cup T_3(S) \cup \cdots \cup T_k(S) \tag{12}$$



The following theorem addresses the converse problem of determining a self-similar set from a collection of similitudes. 



THEOREM 11.14.1

If $T_1, T_2, T_3, \ldots, T_k$ are contracting similitudes with the same scale factor, then there is a unique nonempty closed and bounded set $S$ in the Euclidean plane such that

$$S = T_1(S) \cup T_2(S) \cup T_3(S) \cup \cdots \cup T_k(S)$$

Furthermore, if the sets $T_1(S), T_2(S), T_3(S), \ldots, T_k(S)$ are nonoverlapping, then $S$ is self-similar.



Algorithms for Generating Fractals 

In general, there is no simple way to obtain the set S in the preceding theorem directly. We now describe an iterative procedure that 
will determine S from the similitudes that define it. We first give an example of the procedure and then give an algorithm for the 
general case. 



EXAMPLE 11 Sierpinski Carpet

Figure 11.14.15 shows the unit square region $S_0$ in the $xy$-plane, which will serve as an "initial" set for an iterative procedure for the construction of the Sierpinski carpet. The set $S_1$ in the figure is the result of mapping $S_0$ with each of the eight similitudes $T_i$ ($i = 1, 2, \ldots, 8$) in Equation 8 that determine the Sierpinski carpet. It consists of eight square regions, each of side length $\frac{1}{3}$, surrounding an empty middle square. We next apply the eight similitudes to $S_1$ and arrive at the set $S_2$. Similarly, applying the eight similitudes to $S_2$ results in the set $S_3$. If we continue this process indefinitely, the sequence of sets $S_1, S_2, S_3, \ldots$ will "converge" to a set $S$, which is the Sierpinski carpet.

Figure 11.14.15



Remark Although we should properly give a definition of what it means for a sequence of sets to "converge" to a given set, an 
intuitive interpretation will suffice in this introductory treatment. 

Although we started in Figure 11.14.15 with the unit square region to arrive at the Sierpinski carpet, we could have started with any nonempty set $S_0$. The only restriction is that the set $S_0$ be closed and bounded. For example, if we start with the particular set $S_0$ shown in Figure 11.14.16, then $S_1$ is the set obtained by applying each of the eight similitudes in Equation 8. Applying the eight similitudes to $S_1$ results in the set $S_2$. As before, applying the eight similitudes indefinitely yields the Sierpinski carpet $S$ as the limiting set.

Figure 11.14.16



The general algorithm illustrated in the preceding example is as follows: Let $T_1, T_2, T_3, \ldots, T_k$ be contracting similitudes with the same scale factor, and for an arbitrary set $Q$ in $R^2$, define the set $J(Q)$ by

$$J(Q) = T_1(Q) \cup T_2(Q) \cup T_3(Q) \cup \cdots \cup T_k(Q)$$

The following algorithm generates a sequence of sets $S_0, S_1, \ldots, S_n, \ldots$ that converges to the set $S$ in Theorem 11.14.1.

Algorithm 1

Step 0. Choose an arbitrary nonempty closed and bounded set $S_0$ in $R^2$.

Step 1. Compute $S_1 = J(S_0)$.

Step 2. Compute $S_2 = J(S_1)$.

Step 3. Compute $S_3 = J(S_2)$.

...

Step n. Compute $S_n = J(S_{n-1})$.
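Algorithm 1 can be sketched in a few lines of Python when the initial set $S_0$ is a finite set of points (a finite set is closed and bounded). The sketch below assumes the eight Sierpinski carpet similitudes have $s = \frac{1}{3}$, $\theta = 0$, and translations to every cell of a 3-by-3 grid except the middle one; the names `CARPET_SHIFTS`, `J`, and `iterate` are our own. Exact fractions are used so that duplicate points are recognized reliably.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)

# Assumed translations (e_i, f_i) of the eight carpet similitudes:
# all cells of the 3-by-3 grid except the middle cell (1, 1).
CARPET_SHIFTS = [
    (i * THIRD, j * THIRD)
    for i in range(3)
    for j in range(3)
    if (i, j) != (1, 1)
]


def J(points):
    """Set mapping J(Q) = T_1(Q) U ... U T_8(Q) for a finite point set Q."""
    return {
        (x * THIRD + e, y * THIRD + f)
        for (x, y) in points
        for (e, f) in CARPET_SHIFTS
    }


def iterate(points, n):
    """Steps 1 through n of Algorithm 1: S_n = J(S_{n-1})."""
    for _ in range(n):
        points = J(points)
    return points


S0 = {(Fraction(0), Fraction(0))}  # a one-point initial set
S2 = iterate(S0, 2)
print(len(S2), "points in S_2")
```

Each step multiplies the number of points by at most eight, so the point count grows quickly with $n$; this is the cost that motivates the random approach described later in this section.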



EXAMPLE 12 Sierpinski Triangle

Let us construct the Sierpinski triangle determined by the three similitudes given in Equation 10. The corresponding set mapping is $J(Q) = T_1(Q) \cup T_2(Q) \cup T_3(Q)$. Figure 11.14.17 shows an arbitrary closed and bounded set $S_0$; the first four iterates $S_1$, $S_2$, $S_3$, $S_4$; and the limiting set $S$ (the Sierpinski triangle).

EXAMPLE 13 Using Algorithm 1



Consider the following two similitudes:

$$T_1\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \frac{1}{2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

$$T_2\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \frac{1}{2} \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} .3 \\ .3 \end{bmatrix}$$

The actions of these two similitudes on the unit square $U$ are illustrated in Figure 11.14.18. Here, the rotation angle $\theta$ is a parameter that we shall vary to generate different self-similar sets. The self-similar sets determined by these two similitudes are shown in Figure 11.14.19 for various values of $\theta$. For simplicity, we have not drawn the $xy$-axes, but in each case the origin is the lower left point of the set. These sets were generated on a computer using Algorithm 1 for the various values of $\theta$. Because $k = 2$ and $s = \frac{1}{2}$, it follows from Equation 2 that the Hausdorff dimension of these sets for any value of $\theta$ is 1. It can be shown that the topological dimension of these sets is 1 for $\theta = 0$ and 0 for all other values of $\theta$. It follows that the self-similar set for $\theta = 0$ is not a fractal [it is the straight line segment from (0, 0) to (.6, .6)], while the self-similar sets for all other values of $\theta$ are fractals. In particular, they are examples of fractals with integer Hausdorff dimension.
A Monte Carlo Approach

The set-mapping approach of constructing self-similar sets described in Algorithm 1 is rather time-consuming on a computer because the similitudes involved must be applied to each of the many computer screen pixels in the successive iterated sets. In 1985 Michael Barnsley described an alternative, more practical method of generating a self-similar set defined through its similitudes. It is a so-called Monte Carlo method that takes advantage of probability theory. Barnsley refers to it as the Random Iteration Algorithm.

Let $T_1, T_2, T_3, \ldots, T_k$ be contracting similitudes with the same scale factor. The following algorithm generates a sequence of points

$$\begin{bmatrix} x_0 \\ y_0 \end{bmatrix},\ \begin{bmatrix} x_1 \\ y_1 \end{bmatrix},\ \ldots,\ \begin{bmatrix} x_n \\ y_n \end{bmatrix},\ \ldots$$

that collectively converge to the set $S$ in Theorem 11.14.1.



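Algorithm 2

Step 0. Choose an arbitrary point $\begin{bmatrix} x_0 \\ y_0 \end{bmatrix}$ in $S$.

Step 1. Choose one of the $k$ similitudes at random, say $T_{k_1}$, and compute

$$\begin{bmatrix} x_1 \\ y_1 \end{bmatrix} = T_{k_1}\left(\begin{bmatrix} x_0 \\ y_0 \end{bmatrix}\right)$$

Step 2. Choose one of the $k$ similitudes at random, say $T_{k_2}$, and compute

$$\begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = T_{k_2}\left(\begin{bmatrix} x_1 \\ y_1 \end{bmatrix}\right)$$

...

Step n. Choose one of the $k$ similitudes at random, say $T_{k_n}$, and compute

$$\begin{bmatrix} x_n \\ y_n \end{bmatrix} = T_{k_n}\left(\begin{bmatrix} x_{n-1} \\ y_{n-1} \end{bmatrix}\right)$$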

On a computer screen the pixels corresponding to the points generated by this algorithm will fill out the pixel representation of the 
limiting set S. 

Figure 11.14.20 shows four stages of the Random Iteration Algorithm that generate the Sierpinski carpet, starting from a chosen initial point $(x_0, y_0)$.

Remark Although Step 0 in the preceding algorithm requires the selection of an initial point in the set $S$, which may not be known in advance, this is not a serious problem. In practice, one can usually start with any point in $R^2$, and after a few iterations (say ten or so), the point generated will be sufficiently close to $S$ that the algorithm will work correctly from that point on.
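The Random Iteration Algorithm is short enough to sketch directly. The Python code below applies it to the three Sierpinski triangle similitudes of Equation 10 ($s = \frac{1}{2}$, $\theta = 0$, translations $(0, 0)$, $(\frac{1}{2}, 0)$, $(0, \frac{1}{2})$). The function name `random_iteration`, the fixed seed, and the choice of equal probabilities for the similitudes are assumptions made for illustration. The starting point (0, 0) is the fixed point of $T_1$ and therefore lies in $S$.

```python
import random

# Translations (e, f) of the three similitudes in Equation 10.
TRIANGLE_SHIFTS = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5)]


def random_iteration(n, start=(0.0, 0.0), seed=0):
    """Algorithm 2: repeatedly apply a randomly chosen similitude to the last point."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    x, y = start
    points = []
    for _ in range(n):
        e, f = rng.choice(TRIANGLE_SHIFTS)   # pick one similitude at random
        x, y = 0.5 * x + e, 0.5 * y + f      # scale by s = 1/2, then translate
        points.append((x, y))
    return points


points = random_iteration(5000)
print(points[:3])
```

Plotting `points` as single pixels (for example with a scatter plot in any graphics library) fills out the pixel picture of the limiting set. Only one point is transformed per step, in contrast to Algorithm 1, where every point of the current set is transformed by every similitude.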



More General Fractals 



So far, we have discussed fractals that are self-similar sets according to the definition of a self-similar set in $R^2$. However, Theorem 11.14.1 remains true if the similitudes $T_1, T_2, \ldots, T_k$ are replaced by more general transformations, called contracting affine transformations. An affine transformation is defined as follows:

DEFINITION

An affine transformation is a mapping of $R^2$ into $R^2$ of the form

$$T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} e \\ f \end{bmatrix}$$

where $a$, $b$, $c$, $d$, $e$, and $f$ are scalars.
Figure 11.14.21 shows how an affine transformation maps the unit square $U$ onto a parallelogram $T(U)$. An affine transformation is said to be contracting if the Euclidean distance between any two points in the plane is strictly decreased after the two points are mapped by the transformation. It can be shown that any $k$ contracting affine transformations $T_1, T_2, \ldots, T_k$ determine a unique closed and bounded set $S$ satisfying the equation

$$S = T_1(S) \cup T_2(S) \cup T_3(S) \cup \cdots \cup T_k(S) \tag{13}$$

Equation 13 has the same form as Equation 12, which we used to find self-similar sets. Although Equation 13, which uses contracting affine transformations, does not determine a self-similar set $S$, the set it does determine has many of the features of self-similar sets. For example, Figure 11.14.22 shows how a set in the plane resembling a fern (an example made famous by Barnsley) can be generated through four contracting affine transformations. Note that the middle fern is the slightly overlapping union of the four smaller affine-image ferns surrounding it. Note also how $T_3$, because the determinant of its matrix part is zero, maps the entire fern onto the small straight line segment between the points (.50, 0) and (.50, .16). Figure 11.14.22 contains a wealth of information and should be studied carefully.
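Whether a given affine transformation is contracting can be tested numerically. Since $T(\mathbf{p}) - T(\mathbf{q}) = A(\mathbf{p} - \mathbf{q})$, where $A$ is the matrix part, the translation plays no role, and distances strictly decrease for every pair of distinct points exactly when the largest singular value of $A$ is less than 1. The Python sketch below uses the closed-form expression for that value in the $2 \times 2$ case; the function names and the sample matrices are our own illustrations and are not the fern coefficients of Figure 11.14.22.

```python
import math


def largest_stretch(a, b, c, d):
    """Largest singular value of [[a, b], [c, d]], i.e., the maximum stretch factor."""
    q = a * a + b * b + c * c + d * d          # trace of A^T A
    det = a * d - b * c                        # det(A); det(A^T A) = det^2
    disc = max(q * q - 4.0 * det * det, 0.0)   # guard against tiny negative rounding
    return math.sqrt((q + math.sqrt(disc)) / 2.0)


def is_contracting(a, b, c, d):
    """True if the affine map with matrix part [[a, b], [c, d]] shrinks all distances."""
    return largest_stretch(a, b, c, d) < 1.0


print(is_contracting(0.5, 0.0, 0.0, 0.5))   # similitude with s = 1/2
print(is_contracting(1.0, 1.0, 0.0, 1.0))   # shear: determinant 1, not contracting
print(is_contracting(2.0, 0.0, 0.0, 0.1))   # small determinant, but stretches along x
```

The last example shows that a small determinant is not enough: the determinant measures how areas scale, whereas the contracting property concerns distances in every direction.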
Figure 11.14.22

Michael Barnsley is actively pursuing an application of the above theory to the field of data compression and transmission. The fern, for example, is completely determined by the four affine transformations $T_1$, $T_2$, $T_3$, $T_4$. These four transformations, in turn, are determined by the 24 numbers given in Figure 11.14.22 defining their corresponding values of $a$, $b$, $c$, $d$, $e$, and $f$. In other words, these 24 numbers completely encode the picture of the fern. Storing these 24 numbers in a computer requires considerably less memory space than storing a pixel-by-pixel description of the fern. In principle, any picture represented by a pixel map on a computer screen can be described through a finite number of affine transformations, although it is not easy to determine which transformations to use. Nevertheless, once encoded, the affine transformations generally require several orders of magnitude less computer memory than a pixel-by-pixel description of the pixel map.

Further Readings 

Readers interested in learning more about fractals are referred to the following books, the first of which elaborates on the linear 
transformation approach of this section. 

1. Michael Barnsley, Fractals Everywhere (New York: Academic Press, 1993).

2. Benoit B. Mandelbrot, The Fractal Geometry of Nature (New York: W. H. Freeman, 1982).

3. Heinz-Otto Peitgen and P. H. Richter, The Beauty of Fractals (New York: Springer-Verlag, 1986).

4. Heinz-Otto Peitgen and Dietmar Saupe, The Science of Fractal Images (New York: Springer-Verlag, 1988).



Exercise Set 11.14



1. The self-similar set in Figure Ex-1 has the sizes indicated. Given that its lower left corner is situated at the origin of the $xy$-plane, find the similitudes that determine the set. What is its Hausdorff dimension? Is it a fractal?

Figure Ex-1

2. Find the Hausdorff dimension of the self-similar set shown in Figure Ex-2. Use a ruler to measure the figure and determine an approximate value of the scale factor $s$. What are the rotation angles of the similitudes determining this set?

Figure Ex-2

3. For each of the self-similar sets in the accompanying figure, find: (i) the scale factor $s$ of the similitudes describing the set; (ii) the rotation angles of all similitudes describing the set (all rotation angles are multiples of 90°); and (iii) the Hausdorff dimension of the set. Which of the sets are fractals and why?

Figure Ex-3

4. Show that of the four affine transformations shown in Figure 11.14.22, only the transformation $T_2$ is a similitude. Determine its scale factor $s$ and rotation angle $\theta$.

5. Find the coordinates of the tip of the fern in Figure 11.14.22.
Hint The transformation $T_2$ maps the tip of the fern to itself.

6. The square in Figure 11.14.7a was expressed as the union of 4 nonoverlapping squares as in Figure 11.14.7b. Suppose that it is expressed instead as the union of 16 nonoverlapping squares. Verify that its Hausdorff dimension is still two, as determined by Equation 2.



7. Show that the four similitudes

express the unit square as the union of four overlapping squares. Evaluate the right-hand side of Equation 2 for the values of $k$ and $s$ determined by these similitudes, and show that the result is not the correct value of the Hausdorff dimension of the unit square.

Note This exercise shows the necessity of the nonoverlapping condition in the definition of a self-similar set and its Hausdorff dimension.

8. All of the results in this section can be extended to $R^n$. Compute the Hausdorff dimension of the unit cube in $R^3$ (see Figure Ex-8). Given that the topological dimension of the unit cube is three, determine whether it is a fractal.

Figure Ex-8

Hint Express the unit cube as the union of eight smaller congruent nonoverlapping cubes.



9. The set in $R^3$ in the accompanying figure is called the Menger sponge. It is a self-similar set obtained by drilling out certain square holes from the unit cube. Note that each face of the Menger sponge is a Sierpinski carpet and that the holes in the Sierpinski carpet now run all the way through the Menger sponge. Determine the values of $k$ and $s$ for the Menger sponge and find its Hausdorff dimension. Is the Menger sponge a fractal?

Figure Ex-9

determine a fractal known as the Cantor set. Starting with the unit square region $U$ as an initial set, sketch the first four sets that Algorithm 1 determines. Also, find the Hausdorff dimension of the Cantor set. (This famous set was the first example that Hausdorff gave in his 1919 paper of a set whose Hausdorff dimension is not equal to its topological dimension.)

11. Compute the areas of the sets $S_0$, $S_1$, $S_2$, $S_3$, and $S_4$ in Figure 11.14.15.



Section 11.14 



Technology Exercises

The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 



T1. Use similitudes of the form

to show that the Menger sponge (see Exercise 9) is the set $S$ satisfying

for appropriately chosen similitudes $T_i$ (for $i = 1, 2, 3, \ldots, 20$). Determine these similitudes by determining the collection of $3 \times 1$ matrices

for $i = 1, 2, 3, \ldots, 20$.

T2. Generalize the ideas involved in the Cantor set (in $R^1$), the Sierpinski carpet (in $R^2$), and the Menger sponge (in $R^3$) to $R^n$ by considering the set $S$ satisfying

where each $a_k$ equals 0, $\frac{1}{3}$, or $\frac{2}{3}$, and no two of them ever equal $\frac{1}{3}$ at the same time. Use a computer to construct the set

for $i = 1, 2, 3, \ldots, m_n$,

thereby determining the value of $m_n$ for $n = 2, 3, 4$. Then develop an expression for $m_n$.
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11.15 

CHAOS 



In this section we use a map of the unit square in the xy-plane onto itself to 
describe the concept of a chaotic mapping. 



Prerequisites: Geometry of Linear Operators on $R^2$ (Section 9.2) 



Eigenvalues and Eigenvectors 



Intuitive Understanding of Limits and Continuity 



Chaos 

The word chaos was first used in a mathematical sense in 1975 by Tien-Yien Li and James Yorke in a paper entitled "Period Three 
Implies Chaos." The term is now used to describe the behavior of certain mathematical mappings and physical phenomena that at 
first glance seem to behave in a random or disorderly fashion but actually have an underlying element of order (examples include 
random-number generation, shuffling cards, cardiac arrhythmia, fluttering airplane wings, changes in the red spot of Jupiter, and 
deviations in the orbit of Pluto). In this section we discuss a particular chaotic mapping called Arnold's cat map, after the Russian 
mathematician Vladimir I. Arnold who first described it using a diagram of a cat. 

Arnold's Cat Map 

To describe Arnold's cat map, we need a few ideas about modular arithmetic. If x is a real number, then the notation x mod 1 
denotes the unique number in the interval [0, 1) that differs from x by an integer. For example, 

2.3 mod 1 = 0.3,   0.9 mod 1 = 0.9,   -3.7 mod 1 = 0.3,   2.0 mod 1 = 0 

Note that if x is a nonnegative number, then x mod 1 is simply the fractional part of x. If (x, y) is an ordered pair of real numbers, 
then the notation (x, y) mod 1 denotes (x mod 1, y mod 1). For example, 

(2.3, -7.9) mod 1 = (0.3, 0.1) 

Observe that for every real number x, the point x mod 1 lies in the unit interval [0, 1) and that for every ordered pair (x, y), the 
point (x, y) mod 1 lies in the unit square 

\[ S = \{(x, y) \mid 0 \le x < 1,\ 0 \le y < 1\} \]
Also observe that the upper boundary and the right-hand boundary of the square are not included in S. 
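
The mod 1 notation can be checked with a short Python sketch. It uses exact fractions so that no rounding occurs; the helper names `mod1` and `mod1_pair` are our own and do not come from the text.

```python
from fractions import Fraction

def mod1(x):
    """Return the unique number in [0, 1) that differs from x by an integer."""
    return Fraction(x) % 1

def mod1_pair(point):
    """Apply mod 1 to each coordinate of an ordered pair."""
    x, y = point
    return (mod1(x), mod1(y))

# The examples from the text, written as exact fractions.
print(mod1(Fraction(23, 10)))     # 3/10
print(mod1(Fraction(-37, 10)))    # 3/10
print(mod1(2))                    # 0
print(mod1_pair((Fraction(23, 10), Fraction(-79, 10))))
```

Note that for a negative number such as -3.7 the result is 0.3, not 0.7: the value returned is always the offset above the next lower integer.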

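Arnold's cat map is the transformation $\Gamma : R^2 \to R^2$ defined by the formula 

\[ \Gamma : (x, y) \to (x + y,\ x + 2y) \bmod 1 \]

or, in matrix notation, 

\[ \Gamma\left( \begin{bmatrix} x \\ y \end{bmatrix} \right) = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \bmod 1 \tag{1} \]

To understand the geometry of Arnold's cat map, it is helpful to write (1) in the factored form 

\[ \Gamma\left( \begin{bmatrix} x \\ y \end{bmatrix} \right) = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \bmod 1 \]

which expresses Arnold's cat map as the composition of a shear in the x-direction with factor 1, followed by a shear in the 
y-direction with factor 1. Because the computations are performed mod 1, $\Gamma$ maps all points of $R^2$ into the unit square S. 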
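
As a minimal sketch of the definition, the following Python code implements the map directly and as the composition of the two shears. The function names and the sample point (3/4, 1/2) are our own choices for illustration.

```python
from fractions import Fraction

def shear_x(point):
    """Shear in the x-direction with factor 1: (x, y) -> (x + y, y)."""
    x, y = point
    return (x + y, y)

def shear_y(point):
    """Shear in the y-direction with factor 1: (x, y) -> (x, x + y)."""
    x, y = point
    return (x, x + y)

def cat_map(point):
    """Arnold's cat map: (x, y) -> (x + y, x + 2y) mod 1."""
    x, y = point
    return ((x + y) % 1, (x + 2 * y) % 1)

def cat_map_factored(point):
    """The same map computed as shear_x, then shear_y, then mod 1."""
    x, y = shear_y(shear_x(point))
    return (x % 1, y % 1)

sample = (Fraction(3, 4), Fraction(1, 2))
print(cat_map(sample))            # (Fraction(1, 4), Fraction(3, 4))
print(cat_map_factored(sample))   # (Fraction(1, 4), Fraction(3, 4))
```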

We will illustrate the effect of Arnold's cat map on the unit square S, which is shaded in Figure 11.15.1a and contains a picture of a 
cat. It can be shown that it does not matter whether the mod 1 computations are carried out after each shear or at the very end. We 
will discuss both methods, first performing them at the end. The steps are as follows: 

Step 1. Shear in the x-direction with factor 1 (Figure 11.15.1b): 

\[ (x, y) \to (x + y,\ y) \]

or, in matrix notation, 

\[ \begin{bmatrix} x \\ y \end{bmatrix} \to \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]

Step 2. Shear in the y-direction with factor 1 (Figure 11.15.1c): 

\[ (x, y) \to (x,\ x + y) \]

or, in matrix notation, 

\[ \begin{bmatrix} x \\ y \end{bmatrix} \to \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]

Step 3. Reassembly into S (Figure 11.15.1d): 

\[ (x, y) \to (x, y) \bmod 1 \]

The geometric effect of the mod 1 arithmetic is to break up the parallelogram in Figure 11.15.1c and 
reassemble the pieces of S as shown in Figure 11.15.1d. 

Figure 11.15.1 

For computer implementation, it is more convenient to perform the mod 1 arithmetic at each step, rather than at the end. With this 
approach there is a reassembly at each step, but the net effect is the same. The steps are as follows: 



Step 1. Shear in the x-direction with factor 1, followed by a reassembly into S (Figure 11.15.2b): 

\[ (x, y) \to (x + y,\ y) \bmod 1 \]

Step 2. Shear in the y-direction with factor 1, followed by a reassembly into S (Figure 11.15.2c): 

\[ (x, y) \to (x,\ x + y) \bmod 1 \]
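
The claim that the two procedures agree can be spot-checked numerically. The sketch below compares them on a 12-by-12 grid of sample points using exact fractions; the grid size and function names are assumptions made for illustration, and a finite check is of course not a proof.

```python
from fractions import Fraction

def mod_at_end(x, y):
    """Both shears first, then a single reassembly into S."""
    x1, y1 = x + y, y            # shear in the x-direction
    x2, y2 = x1, x1 + y1         # shear in the y-direction
    return (x2 % 1, y2 % 1)      # mod 1 at the very end

def mod_each_step(x, y):
    """Reassemble into S after each shear."""
    x1, y1 = (x + y) % 1, y % 1           # shear in x, then mod 1
    return (x1 % 1, (x1 + y1) % 1)        # shear in y, then mod 1

grid = [(Fraction(m, 12), Fraction(n, 12)) for m in range(12) for n in range(12)]
agree = all(mod_at_end(x, y) == mod_each_step(x, y) for x, y in grid)
print(agree)  # True
```

The agreement holds because reducing x + y mod 1 changes it only by an integer, and an integer added before the second shear disappears in the final reduction.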
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Figure 11.15.2 

Repeated Mappings 

Chaotic mappings such as Arnold's cat map usually arise in physical models in which an operation is performed repeatedly. For 
example, cards are mixed by repeated shuffles, paint is mixed by repeated stirs, water in a tidal basin is mixed by repeated tidal 
changes, and so forth. Thus, we are interested in examining the effect on S of repeated applications (or iterations) of Arnold's cat 
map. Figure 11.15.3, which was generated on a computer, shows the effect of 25 iterations of Arnold's cat map on the cat in the 
unit square S. Two interesting phenomena occur: 

* 
The cat returns to its original form at the 25th iteration. 

* 
At some of the intermediate iterations, the cat is decomposed into streaks that seem to have a specific direction. 



Much of the remainder of this section is devoted to explaining these phenomena. 
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Figure 11.15.3 



Our first goal is to explain why the cat in Figure 11.15.3 returns to its original configuration at the 25th iteration. For this purpose it 
will be helpful to think of a picture in the xy-plane as an assignment of colors to the points in the plane. For pictures generated on 
a computer screen or other digital device, hardware limitations require that a picture be broken up into discrete squares, called 
pixels. For example, in the computer-generated pictures in Figure 11.15.3 the unit square S is divided into a grid with 101 pixels on 
a side for a total of 10,201 pixels, each of which is black or white (Figure 11.15.4). An assignment of colors to pixels to create a 
picture is called a pixel map. 
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Figure 11.15.4 



As shown in Figure 11.15.5, each pixel in S can be assigned a unique pair of coordinates of the form (m/101, n/101) that 
identifies its lower left-hand corner, where m and n are integers in the range 0, 1, 2, ..., 100. We call these points pixel points 
because each such point identifies a unique pixel. Instead of restricting the discussion to the case where S is subdivided into an 
array with 101 pixels on a side, let us consider the more general case where there are p pixels per side. Thus, each pixel map in S 
consists of $p^2$ pixels uniformly spaced 1/p units apart in both the x- and the y-directions. The pixel points in S have coordinates of 
the form (m/p, n/p), where m and n are integers ranging from 0 to p - 1. 
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Figure 11.15.5 
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Under Arnold's cat map each pixel point of S is transformed into another pixel point of S. To see why this is so, observe that the 
image of the pixel point (m/p, n/p) under $\Gamma$ is given in matrix form by 

\[ \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} m/p \\ n/p \end{bmatrix} \bmod 1 = \begin{bmatrix} (m + n)/p \\ (m + 2n)/p \end{bmatrix} \bmod 1 \tag{2} \]



The ordered pair ((m + n)/p, (m + 2n)/p) mod 1 is of the form (m'/p, n'/p), where m' and n' lie in the range 0, 1, 2, ..., 
p - 1. Specifically, m' and n' are the remainders when m + n and m + 2n are divided by p, respectively. Consequently, each point 
in S of the form (m/p, n/p) is mapped onto another point of the same form. 

Because Arnold's cat map transforms every pixel point of S into another pixel point of S, and because there are only $p^2$ different 
pixel points in S, it follows that any given pixel point must return to its original position after at most $p^2$ iterations of Arnold's cat 
map. 
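
On pixel points the map reduces to integer arithmetic mod p, as Formula (2) shows. The following sketch, with the hypothetical helper `cat_map_pixel`, works with the integer indices (m, n) instead of the fractions (m/p, n/p) and checks for p = 101 that the images of the pixel points are again pixel points and that no two pixel points share an image.

```python
def cat_map_pixel(m, n, p):
    """Image of the pixel point (m/p, n/p), returned as integer indices (m', n')."""
    return ((m + n) % p, (m + 2 * n) % p)

p = 101
images = {cat_map_pixel(m, n, p) for m in range(p) for n in range(p)}

# All p*p pixel points occur as images, so the map permutes the pixel points.
print(len(images) == p * p)  # True
```

That the map permutes the pixel points (and does not merely send them into the grid) is what guarantees that each point eventually returns to its own starting position.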



EXAMPLE 1 Using Formula 2 



If p = 76, then (2) becomes 
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In this case the successive iterates of the point — , — are 
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(verify). Because the point returns to its initial position on the ninth application of Arnold's cat map (but no sooner), the point is 
said to have period 9, and the set of nine distinct iterates of the point is called a 9-cycle. Figure 11.15.6 shows this 9-cycle with the 
initial point labeled and its successive iterates labeled accordingly. 




Figure 11.15.6 



In general, a point that returns to its initial position after n applications of Arnold's cat map, but does not return with fewer than n 
applications, is said to have period n, and its set of n distinct iterates is called an n-cycle. Arnold's cat map maps (0, 0) into (0, 0), 
so this point has period 1. Points with period 1 are also called fixed points. We leave it as an exercise (Exercise 11) to show that 
(0, 0) is the only fixed point of Arnold's cat map. 
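
The period of a pixel point can be found by iterating until the point first returns. The sketch below is one way to do this; the starting point (27/76, 58/76) is our own illustrative choice of a point on the p = 76 grid, not a value taken from the text.

```python
def cat_map_pixel(m, n, p):
    """One application of the cat map to (m/p, n/p), in integer indices."""
    return ((m + n) % p, (m + 2 * n) % p)

def period(m, n, p):
    """Number of iterations before (m/p, n/p) first returns to itself."""
    start = (m, n)
    point = cat_map_pixel(m, n, p)
    count = 1
    while point != start:
        point = cat_map_pixel(point[0], point[1], p)
        count += 1
    return count

print(period(27, 58, 76))   # 9
print(period(0, 0, 76))     # 1

# Search the p = 76 grid for fixed points (points of period 1).
fixed = [(m, n) for m in range(76) for n in range(76)
         if cat_map_pixel(m, n, 76) == (m, n)]
print(fixed)                # [(0, 0)]
```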

Period versus Pixel Width 

If $p_1$ and $p_2$ are points with periods $q_1$ and $q_2$, respectively, then $p_1$ returns to its initial position in $q_1$ iterations (but no sooner), 
and $p_2$ returns to its initial position in $q_2$ iterations (but no sooner); thus, both points return to their initial positions in any number 
of iterations that is a multiple of both $q_1$ and $q_2$. In general, for a pixel map with $p^2$ pixel points of the form (m/p, n/p), we let 
$\Pi(p)$ denote the least common multiple of the periods of all the pixel points in the map [that is, $\Pi(p)$ is the smallest integer that is 
divisible by all of the periods]. It follows that the pixel map will return to its initial configuration in $\Pi(p)$ iterations of Arnold's cat 
map (but no sooner). For this reason, we call $\Pi(p)$ the period of the pixel map. In Exercise 4 we ask the reader to show that if 
p = 101, then all pixel points have period 1, 5, or 25, so $\Pi(101) = 25$. This explains why the cat in Figure 11.15.3 returned to its 
initial configuration in 25 iterations. 
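
One way to compute the period of a pixel map is sketched below. It relies on the observation that every pixel point returns after n iterations exactly when the nth power of the matrix of the cat map is the identity mod p (apply the power to the points (1/p, 0) and (0, 1/p)). The function names are our own.

```python
def mat_mult_mod(A, B, p):
    """Product of two 2x2 integer matrices, with entries reduced mod p."""
    return [[(A[i][0] * B[0][j] + A[i][1] * B[1][j]) % p for j in range(2)]
            for i in range(2)]

def pixel_map_period(p):
    """Smallest n >= 1 with C^n = I (mod p), where C = [[1, 1], [1, 2]] and p >= 2."""
    C = [[1, 1], [1, 2]]
    identity = [[1, 0], [0, 1]]
    power = [[entry % p for entry in row] for row in C]
    n = 1
    while power != identity:
        power = mat_mult_mod(power, C, p)
        n += 1
    return n

print(pixel_map_period(101))   # 25
print(pixel_map_period(250))   # 750
```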

Figure 11.15.7 shows how the period of a pixel map varies with p. Although the general tendency is for the period to increase as p 
increases, there is a surprising amount of irregularity in the graph. Indeed, there is no simple function that specifies this 
relationship (see Exercise 1). 
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Figure 11.15.7 



Although a pixel map with p pixels on a side does not return to its initial configuration until $\Pi(p)$ iterations have occurred, various 
unexpected things can occur at intermediate iterations. For example, Figure 11.15.8 shows a pixel map with p = 250 of the famous 
Hungarian-American mathematician John von Neumann. It can be shown that $\Pi(250) = 750$; hence, the pixel map will return to 
its initial configuration after 750 iterations of Arnold's cat map (but no sooner). However, after 375 iterations the pixel map is 
turned upside down, and after another 375 iterations (for a total of 750) the pixel map is returned to its initial configuration. 
Moreover, there are so many pixel points with periods that divide 750 that multiple ghostlike images of the original likeness occur 
at intermediate iterations; at 195 iterations numerous miniatures of the original likeness occur in diagonal rows. 
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Figure 11.15.8 

The Tiled Plane 

Our next objective is to explain the cause of the linear streaks that occur in Figure 11.15.3. For this purpose it will be helpful to 
view Arnold's cat map another way. As defined, Arnold's cat map is not a linear transformation because of the mod 1 arithmetic. 
However, there is an alternative way of defining Arnold's cat map that avoids the mod 1 arithmetic and results in a linear 
transformation. For this purpose, imagine that the unit square S with its picture of the cat is a "tile," and suppose that the entire 
plane is covered with such tiles, as in Figure 11.15.9. We say that the xy-plane has been tiled with the unit square. If we apply the 
matrix transformation in (1) to the entire tiled plane without performing the mod 1 arithmetic, then it can be shown that the portion 
of the image within S will be identical to the image that we obtained using the mod 1 arithmetic (Figure 11.15.9). In short, the 
tiling results in the same pixel map in S as the mod 1 arithmetic, but in the tiled case Arnold's cat map is a linear transformation. 
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Figure 11.15.9 



It is important to understand, however, that tiling and mod 1 arithmetic produce periodicity in different ways. If a pixel map in S 
has period n, then in the case of mod 1 arithmetic, each point returns to its original position at the end of n iterations. In the case of 
tiling, points need not return to their original positions; rather, each point is replaced by a point of the same color at the end of n 
iterations. 



Properties of Arnold's Cat Map 

To understand the cause of the streaks in Figure 11.15.3, think of Arnold's cat map as a linear transformation on the tiled plane. 
Observe that the matrix 

\[ C = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \]

that defines Arnold's cat map is symmetric and has a determinant of 1. The fact that the determinant is 1 means that multiplication 
by this matrix preserves areas; that is, the area of any figure in the plane and the area of its image are the same. This is also true for 
figures in S in the case of mod 1 arithmetic, since the effect of the mod 1 arithmetic is to cut up the figure and reassemble the 
pieces without any overlap, as shown in Figure 11.15.1d. Thus, in Figure 11.15.3 the area of the cat (whatever it is) is the same as 
the total area of the blotches in each iteration. 



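The fact that the matrix is symmetric means that its eigenvalues are real and the corresponding eigenvectors are perpendicular. We 
leave it for the reader to show that the eigenvalues and corresponding eigenvectors of C are 

\[ \lambda_1 = \frac{3 + \sqrt{5}}{2} = 2.6180\ldots, \qquad \lambda_2 = \frac{3 - \sqrt{5}}{2} = 0.3819\ldots \]

\[ v_1 = \begin{bmatrix} 1 \\ \frac{1 + \sqrt{5}}{2} \end{bmatrix} = \begin{bmatrix} 1 \\ 1.6180\ldots \end{bmatrix}, \qquad v_2 = \begin{bmatrix} \frac{-1 - \sqrt{5}}{2} \\ 1 \end{bmatrix} = \begin{bmatrix} -1.6180\ldots \\ 1 \end{bmatrix} \]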



For each application of Arnold's cat map, the eigenvalue $\lambda_1$ causes a stretching in the direction of the eigenvector $v_1$ by a factor of 
2.6180..., and the eigenvalue $\lambda_2$ causes a compression in the direction of the eigenvector $v_2$ by a factor of 0.3819.... Figure 
11.15.10 shows a square centered at the origin whose sides are parallel to the two eigenvector directions. Under the above 
mapping, this square is deformed into the rectangle whose sides are also parallel to the two eigenvector directions. The areas of the 
square and the rectangle are the same. 

To explain the cause of the streaks in Figure 11.15.3, consider S to be part of the tiled plane, and let p be a point of S with period n. 
Because we are considering tiling, there is a point q in the plane with the same color as p that on successive iterations moves 
toward the position initially occupied by p, reaching that position on the nth iteration. This point is $q = (A^{-1})^n p = A^{-n} p$, where A 
denotes the matrix of Arnold's cat map, since 

\[ A^n q = A^n (A^{-n} p) = p \]

Thus, with successive iterations, points of S flow away from their initial positions, while at the same time other points in the plane 
(with corresponding colors) flow toward those initial positions, completing their trip on the final iteration of the cycle. Figure 
11.15.11 illustrates this in the case where n = 4 and $p = A^4 q$. Note that p mod 1 = q mod 1, so 
both points occupy the same positions on their respective tiles. The outgoing point moves in the general direction of the 
eigenvector $v_1$, as indicated by the arrows in Figure 11.15.11, and the incoming point moves in the general direction of eigenvector 
$v_2$. It is the "flow lines" in the general directions of the eigenvectors that form the streaks in Figure 11.15.3. 
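
The eigenvalue claims and the incoming-point construction can be checked with a short sketch. The numerical checks use floating point with a tolerance; the point p = (1/3, 2/3), which has period 4, is our own assumed example and is not taken from the figure.

```python
import math
from fractions import Fraction

C = [[1, 1], [1, 2]]
C_INV = [[2, -1], [-1, 1]]   # inverse of C; its entries are integers because det C = 1

def apply(M, v):
    """Multiply a 2x2 matrix by a vector given as a pair."""
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])

sqrt5 = math.sqrt(5)
lam1, lam2 = (3 + sqrt5) / 2, (3 - sqrt5) / 2
v1 = (1.0, (1 + sqrt5) / 2)
v2 = (-(1 + sqrt5) / 2, 1.0)

def is_eigenpair(lam, v):
    """Check C v = lam v up to floating-point tolerance."""
    image = apply(C, v)
    return all(math.isclose(image[i], lam * v[i]) for i in range(2))

print(is_eigenpair(lam1, v1), is_eigenpair(lam2, v2))   # True True
print(v1[0] * v2[0] + v1[1] * v2[1] == 0.0)             # True: perpendicular
print(math.isclose(lam1 * lam2, 1.0))                   # True: product equals det C

# Incoming point for the assumed example p = (1/3, 2/3), n = 4.
p = (Fraction(1, 3), Fraction(2, 3))
q = p
for _ in range(4):
    q = apply(C_INV, q)
print(q)   # (Fraction(-8, 3), Fraction(5, 3))
```

Here q lies outside S but satisfies q mod 1 = p mod 1, so it carries the same color as p in the tiled plane and arrives at the position of p after four applications of the matrix.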
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Figure 11.15.11 



Thus far we have considered the effect of Arnold's cat map on pixel points of the form (m/p, n/p) for an arbitrary positive 
integer p. We know that all such points are periodic. We now consider the effect of Arnold's cat map on an arbitrary point (a, b) in 
S. We classify such points as rational if the coordinates a and b are both rational numbers, and irrational if at least one of the 
coordinates is irrational. Every rational point is periodic, since it is a pixel point for a suitable choice of p. For example, the 
rational point $(r_1/s_1,\ r_2/s_2)$ can be written as $(r_1 s_2 / s_1 s_2,\ r_2 s_1 / s_1 s_2)$, so it is a pixel point with $p = s_1 s_2$. It can be shown 
(Exercise 13) that the converse is also true: Every periodic point must be a rational point. 
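
The argument that a rational point is a pixel point can be turned into a small computation. In this sketch, whose function name and sample point (1/2, 1/3) are assumptions for illustration, the grid size is taken to be the product of the two denominators, as in the text.

```python
from fractions import Fraction

def period_of_rational_point(x, y):
    """Return (p, period) for a rational point (x, y) in S, using p = s1 * s2."""
    x, y = Fraction(x), Fraction(y)
    p = x.denominator * y.denominator          # common grid size p = s1 * s2
    start = (int(x * p), int(y * p))           # integer indices (m, n) with x = m/p, y = n/p
    m, n = start
    count = 0
    while True:
        m, n = (m + n) % p, (m + 2 * n) % p    # one application of the cat map
        count += 1
        if (m, n) == start:
            return p, count

print(period_of_rational_point(Fraction(1, 2), Fraction(1, 3)))   # (6, 12)
```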



It follows from the preceding discussion that the irrational points in S are nonperiodic, so that successive iterates of an irrational 
point $(x_0, y_0)$ in S must all be distinct points in S. Figure 11.15.12, which was computer-generated, shows an irrational point and 
selected iterates up to 100,000. For the particular irrational point that we selected, the iterates do not seem to cluster in any 
particular region of S; rather, they appear to be spread throughout S, becoming denser with successive iterations. 

The behavior of the iterates in Figure 11.15.12 is sufficiently important that there is some terminology associated with it. We say 
that a set D of points in S is dense in S if every circle centered at any point of S encloses points of D, no matter how small the 
radius of the circle is taken (Figure 11.15.13). It can be shown that the rational points are dense in S and the iterates of most (but 
not all) of the irrational points are dense in S. 
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Figure 11.15.13 



Definition of Chaos 



We know that under Arnold's cat map, the rational points of S are periodic and dense in S and that some but not all of the irrational 
points have iterates that are dense in S. These are the basic ingredients of chaos. There are several definitions of chaos in current 
use, but the following one, which is an outgrowth of a definition introduced by Robert L. Devaney in 1986 in his book An 
Introduction to Chaotic Dynamical Systems (Benjamin/Cummings Publishing Company), is most closely related to our work. 



DEFINITION 




A mapping T of S onto itself is said to be chaotic if: 


(i) S contains a dense set of periodic points of the mapping T. 


(ii) There is a point in S whose iterates under T are dense in S. 



Thus Arnold's cat map satisfies the definition of a chaotic mapping. What is noteworthy about this definition is that a chaotic 
mapping exhibits an element of order and an element of disorder — the periodic points move regularly in cycles, but the points with 
dense iterates move irregularly, often obscuring the regularity of the periodic points. This fusion of order and disorder 
characterizes chaotic mappings. 

Dynamical Systems 

Chaotic mappings arise in the study of dynamical systems. Informally stated, a dynamical system can be viewed as a system that 
has a specific state or configuration at each point of time but that changes its state with time. Chemical systems, ecological 
systems, electrical systems, biological systems, economic systems, and so forth can be looked at in this way. In a discrete-time 
dynamical system, the state changes at discrete points of time rather than at each instant. In a discrete-time chaotic dynamical 
system, each state results from a chaotic mapping of the preceding state. For example, if one imagines that Arnold's cat map is 
applied at discrete points of time, then the pixel maps in Figure 11.15.3 can be viewed as the evolution of a discrete-time chaotic 
dynamical system from some initial set of states (each point of the cat is a single initial state) to successive sets of states. 

One of the fundamental problems in the study of dynamical systems is to predict future states of the system from a known initial 
state. In practice, however, the exact initial state is rarely known because of errors in the devices used to measure the initial state. It 
was believed at one time that if the measuring devices were sufficiently accurate and the computers used to perform the iteration 
were sufficiently powerful, then one could predict the future states of the system to any degree of accuracy. But the discovery of 
chaotic systems shattered this belief because it was found that for such systems the slightest error in measuring the initial state or in 
the computation of the iterates becomes magnified exponentially, thereby preventing an accurate prediction of future states. Let us 
demonstrate this sensitivity to initial conditions with Arnold's cat map. 



Suppose that $p_0$ is a point in the xy-plane whose exact coordinates are (0.77837, 0.70904). A measurement error of 0.00001 is 
made in the y-coordinate, such that the point is thought to be located at (0.77837, 0.70905), which we denote by $q_0$. Both $p_0$ and 
$q_0$ are pixel points with p = 100,000 (why?), and thus, since $\Pi(100{,}000) = 75{,}000$, both return to their initial positions after 
75,000 iterations. In Figure 11.15.14 we show the first 50 iterates of $p_0$ under Arnold's cat map as crosses and the first 50 iterates 
of $q_0$ as circles. Although $p_0$ and $q_0$ are close enough that their symbols overlap initially, only their first eight iterates have 
overlapping symbols; from the ninth iteration on their iterates follow divergent paths. 
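
The divergence of the two orbits can be reproduced with exact integer arithmetic on the p = 100,000 grid. The sketch below, with our own helper names, tracks the coordinate-wise gap between the iterates of the two points in units of 1/100,000.

```python
def orbit(point, p, steps):
    """Return the list of the first `steps` iterates (including the start) in integer indices."""
    m, n = point
    points = [(m, n)]
    for _ in range(steps):
        m, n = (m + n) % p, (m + 2 * n) % p
        points.append((m, n))
    return points

p = 100_000
orbit_p = orbit((77837, 70904), p, 12)   # exact initial point
orbit_q = orbit((77837, 70905), p, 12)   # measured initial point

# Gap between the two orbits, coordinate by coordinate, reduced mod p.
gaps = [((b[0] - a[0]) % p, (b[1] - a[1]) % p) for a, b in zip(orbit_p, orbit_q)]
for k, gap in enumerate(gaps):
    print(k, gap)
```

The gap starts at (0, 1) and after nine iterations is (2584, 4181), that is, about 0.026 and 0.042 in the two coordinates, which is large enough to separate the plotted symbols.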






Figure 11.15.14 

It is possible to quantify the growth of the error from the eigenvalues and eigenvectors of Arnold's cat map. For this purpose we 
shall think of Arnold's cat map as a linear transformation on the tiled plane. Recall from Figure 11.15.10 and the related discussion 
that the projected distance between two points in S in the direction of the eigenvector $v_1$ increases by a factor of 2.6180... ($= \lambda_1$) 
with each iteration (Figure 11.15.15). After nine iterations this projected distance increases by a factor of $(2.6180\ldots)^9 = 5777.99\ldots$, 
and with an initial error of roughly 1/100,000 in the direction of $v_1$, this distance is 0.05777..., or about 1/17 the width of the unit 
square S. After 12 iterations this small initial error grows to $(2.6180\ldots)^{12} / 100{,}000 = 1.0368\ldots$, which is greater than the width of 
S. Thus, we completely lose track of the true iterates within S after 12 iterations because of the exponential growth of the initial 
error. 
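
The growth factors quoted above are easy to reproduce. The following sketch computes them in floating point and finds the first iteration at which the projected error exceeds the width of S; the variable names are our own.

```python
import math

lam1 = (3 + math.sqrt(5)) / 2      # larger eigenvalue of the cat map matrix
initial_error = 1 / 100_000

for n in (9, 12):
    print(n, lam1 ** n, initial_error * lam1 ** n)

# First iteration at which the projected error exceeds 1, the width of S.
first_n = next(n for n in range(1, 100) if initial_error * lam1 ** n > 1)
print(first_n)   # 12
```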
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Figure 11.15.15 

Although sensitivity to initial conditions limits the ability to predict the future evolution of dynamical systems, new techniques are 
presently being investigated to describe this future evolution in alternative ways. 



Exercise Set 11.15 



1. In a journal article [F. J. Dyson and H. Falk, "Period of a Discrete Cat Mapping," The American Mathematical Monthly, 99 
(August-September 1992), pp. 603-614] the following results concerning the nature of the function $\Pi(p)$ were established: 

(i) $\Pi(p) = 3p$ if and only if $p = 2 \cdot 5^k$ for k = 1, 2, .... 

(ii) $\Pi(p) = 2p$ if and only if $p = 5^k$ for k = 1, 2, ... or $p = 6 \cdot 5^k$ for k = 0, 1, 2, .... 

(iii) $\Pi(p) \le 12p/7$ for all other choices of p. 

Find $\Pi(250)$, $\Pi(25)$, $\Pi(125)$, $\Pi(30)$, $\Pi(10)$, $\Pi(50)$, $\Pi(3750)$, $\Pi(6)$, and $\Pi(5)$. 

2. Find all the n-cycles that are subsets of the 36 points in S of the form (m/6, n/6) with m and n in the range 0, 1, 2, 3, 4, 5. 
Then find $\Pi(6)$. 



3. (Fibonacci Shift-Register Random-Number Generator) A well-known method of generating a sequence of 
"pseudorandom" integers $x_0, x_1, x_2, x_3, \ldots$ in the interval from 0 to p - 1 is based on the following algorithm: 

(i) Pick any two integers $x_0$ and $x_1$ from the range 0, 1, 2, ..., p - 1. 

(ii) Set $x_{n+1} = (x_n + x_{n-1}) \bmod p$ for n = 1, 2, .... 

Here x mod p denotes the number in the interval from 0 to p - 1 that differs from x by a multiple of p. For example, 
35 mod 9 = 8 (because 8 = 35 - 3·9); 36 mod 9 = 0 (because 0 = 36 - 4·9); and -3 mod 9 = 6 (because 6 = -3 + 1·9). 

(a) Generate the sequence of pseudorandom numbers that results from the choices p = 15, $x_0 = 3$, and $x_1 = 7$ until the 
sequence starts repeating. 

(b) Show that the following formula is equivalent to step (ii) of the algorithm: 

\[ \begin{bmatrix} x_{n+1} \\ x_{n+2} \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x_{n-1} \\ x_n \end{bmatrix} \bmod p \quad \text{for } n = 1, 2, 3, \ldots \]

(c) Use the formula in part (b) to generate the sequence of vectors for the choices p = 21, $x_0 = 5$, and $x_1 = 5$ until the 
sequence starts repeating. 

Remark If we take p = 1 and pick $x_0$ and $x_1$ from the interval [0, 1), then the above random-number generator produces 
pseudorandom numbers in the interval [0, 1). The resulting scheme is precisely Arnold's cat map. Furthermore, if we eliminate 
the modular arithmetic in the algorithm and take $x_0 = x_1 = 1$, then the resulting sequence of integers is the famous Fibonacci 
sequence, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ..., in which each number after the first two is the sum of the preceding two 
numbers. 
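
The algorithm in steps (i) and (ii) can be sketched as follows. The function name is hypothetical, and the parameters p = 10, $x_0 = x_1 = 1$ in the usage example are our own, chosen so as not to coincide with the choices in parts (a) and (c).

```python
def fibonacci_shift_register(p, x0, x1):
    """Return x0, x1, x2, ... up to (not including) the first repeat of the starting pair."""
    seq = [x0, x1]
    while True:
        seq.append((seq[-1] + seq[-2]) % p)      # step (ii) of the algorithm
        if (seq[-2], seq[-1]) == (x0, x1):       # the starting pair has come around again
            return seq[:-2]

numbers = fibonacci_shift_register(10, 1, 1)
print(numbers[:8])    # [1, 1, 2, 3, 5, 8, 3, 1]
print(len(numbers))   # 60
```

The loop always terminates because the step from one consecutive pair to the next can be reversed mod p, so the sequence of pairs must eventually return to its starting pair.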



4. For 

\[ C = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \]

it can be verified that 

\[ C^{25} = \begin{bmatrix} 7{,}778{,}742{,}049 & 12{,}586{,}269{,}025 \\ 12{,}586{,}269{,}025 & 20{,}365{,}011{,}074 \end{bmatrix} \]

It can also be verified that 12,586,269,025 is divisible by 101 and that when 7,778,742,049 and 20,365,011,074 are divided by 
101, the remainder is 1. 

(a) Show that every point in S of the form (m/101, n/101) returns to its starting position after 25 iterations under Arnold's 
cat map. 

(b) Show that every point in S of the form (m/101, n/101) has period 1, 5, or 25. 

(c) Show that the point (1/101, 0) has period greater than 5 by iterating it five times. 

(d) Show that $\Pi(101) = 25$. 



5. Show that for the mapping $T : S \to S$ defined by T(x, y) = (x + — , y) mod 1, every point in S is a periodic point. Why 
does this show that the mapping is not chaotic? 

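6. An Anosov automorphism on $R^2$ is a mapping from the unit square S onto S of the form 

\[ \begin{bmatrix} x \\ y \end{bmatrix} \to \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \bmod 1 \]

in which (i) a, b, c, and d are integers, (ii) the determinant of the matrix is $\pm 1$, and (iii) the eigenvalues of the matrix do not 
have magnitude 1. It can be shown that all Anosov automorphisms are chaotic mappings. 

(a) Show that Arnold's cat map is an Anosov automorphism. 

(b) Which of the following are the matrices of an Anosov automorphism? 

\[ \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad \begin{bmatrix} 3 & 2 \\ 1 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 5 & 7 \\ 2 & 3 \end{bmatrix}, \quad \begin{bmatrix} 6 & 2 \\ 5 & 2 \end{bmatrix} \]

(c) Show that the following mapping of S onto S is not an Anosov automorphism. 

\[ \begin{bmatrix} x \\ y \end{bmatrix} \to \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \bmod 1 \]

What is the geometric effect of this transformation on S? Use your observation to show that the 
mapping is not a chaotic mapping by showing that all points in S are periodic points. 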



7. Show that Arnold's cat map is one-to-one over the unit square S and that its range is S. 

8. Show that the inverse of Arnold's cat map is given by 

\[ \Gamma^{-1}(x, y) = (2x - y,\ -x + y) \bmod 1 \]



9. Show that the unit square S can be partitioned into four triangular regions on each of which Arnold's cat map is a 
transformation of the form 

\[ \begin{bmatrix} x \\ y \end{bmatrix} \to \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} a \\ b \end{bmatrix} \]

where a and b need not be the same for each region. 

Hint Find the regions in S that map onto the four shaded regions of the parallelogram in Figure 11.15.1d. 


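10. If $(x_0, y_0)$ is a point in S and $(x_n, y_n)$ is its nth iterate under Arnold's cat map, show that 

\[ \begin{bmatrix} x_n \\ y_n \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}^n \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} \bmod 1 \]

This result implies that the modular arithmetic need only be performed once rather than after each iteration. 

11. Show that (0, 0) is the only fixed point of Arnold's cat map by showing that the only solution of the equation 

\[ \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} \bmod 1 \]

with $0 \le x_0 < 1$ and $0 \le y_0 < 1$ is $x_0 = y_0 = 0$. 

Hint For appropriate nonnegative integers, r and s, the preceding equation can be written as 

\[ \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} - \begin{bmatrix} r \\ s \end{bmatrix} \]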



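12. Find all 2-cycles of Arnold's cat map by finding all solutions of the equation 

\[ \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}^2 \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} \bmod 1 \]

with $0 \le x_0 < 1$ and $0 \le y_0 < 1$. 

Hint For appropriate nonnegative integers, r and s, the preceding equation can be written as 

\[ \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix} \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} - \begin{bmatrix} r \\ s \end{bmatrix} \]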



13. Show that every periodic point of Arnold's cat map must be a rational point by showing that for all solutions of the equation 

\[ \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}^n \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} \bmod 1 \]

the numbers $x_0$ and $y_0$ are quotients of integers. 



Section 11.15 






Technology Exercises 



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple, 
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 



T1. The methods of Exercise 4 show that for the cat map, $\Pi(p)$ is the smallest integer satisfying the equation 

\[ \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}^{\Pi(p)} \bmod p = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

This suggests that one way to determine $\Pi(p)$ is to compute 

\[ \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}^{n} \bmod p \]

starting with n = 1 and stopping when this produces the identity matrix. Use this idea to compute $\Pi(p)$ for p = 2, 3, ..., 10. 
Compare your results to the formulas given in Exercise 1, if they apply. What can you conjecture about 

\[ \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}^{\Pi(p)/2} \bmod p \]

when $\Pi(p)$ is even? 



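T2. The eigenvalues and eigenvectors for the cat map matrix 

\[ C = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \]

are 

\[ \lambda_1 = \frac{3 + \sqrt{5}}{2}, \quad \lambda_2 = \frac{3 - \sqrt{5}}{2}, \quad v_1 = \begin{bmatrix} 1 \\ \frac{1 + \sqrt{5}}{2} \end{bmatrix}, \quad v_2 = \begin{bmatrix} 1 \\ \frac{1 - \sqrt{5}}{2} \end{bmatrix} \]

Using these eigenvalues and eigenvectors, we can define 

\[ D = \begin{bmatrix} \frac{3 + \sqrt{5}}{2} & 0 \\ 0 & \frac{3 - \sqrt{5}}{2} \end{bmatrix} \quad \text{and} \quad P = \begin{bmatrix} 1 & 1 \\ \frac{1 + \sqrt{5}}{2} & \frac{1 - \sqrt{5}}{2} \end{bmatrix} \]

and write $C = PDP^{-1}$; hence, $C^n = PD^nP^{-1}$. Use a computer to show that 

\[ C^n = \begin{bmatrix} c_{11}^{(n)} & c_{12}^{(n)} \\ c_{21}^{(n)} & c_{22}^{(n)} \end{bmatrix} \]

where 

\[ c_{11}^{(n)} = \frac{1 + \sqrt{5}}{2\sqrt{5}} \left( \frac{3 - \sqrt{5}}{2} \right)^n - \frac{1 - \sqrt{5}}{2\sqrt{5}} \left( \frac{3 + \sqrt{5}}{2} \right)^n \]

\[ c_{22}^{(n)} = \frac{1 + \sqrt{5}}{2\sqrt{5}} \left( \frac{3 + \sqrt{5}}{2} \right)^n - \frac{1 - \sqrt{5}}{2\sqrt{5}} \left( \frac{3 - \sqrt{5}}{2} \right)^n \]

and 

\[ c_{12}^{(n)} = c_{21}^{(n)} = \frac{1}{\sqrt{5}} \left[ \left( \frac{3 + \sqrt{5}}{2} \right)^n - \left( \frac{3 - \sqrt{5}}{2} \right)^n \right] \]

How can you use these results and your conclusions in Exercise T1 to simplify the method for computing $\Pi(p)$? 

11.16 

CRYPTOGRAPHY 

In this section we present a method of encoding and decoding messages. We 
also examine modular arithmetic and show how Gaussian elimination can 
sometimes be used to break an opponent's code. 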



Prerequisites: Matrices 

Gaussian Elimination 

Matrix Operations 

Linear Independence 

Linear Transformations (Sections 8.1 and 8.2) 



Ciphers 

The study of encoding and decoding secret messages is called cryptography. Although secret codes date to the earliest days of 
written communication, there has been a recent surge of interest in the subject because of the need to maintain the privacy of 
information transmitted over public lines of communication. In the language of cryptography, codes are called ciphers, uncoded 
messages are called plaintext, and coded messages are called ciphertext. The process of converting from plaintext to ciphertext is 
called enciphering, and the reverse process of converting from ciphertext to plaintext is called deciphering. 

The simplest ciphers, called substitution ciphers, are those that replace each letter of the alphabet by a different letter. For 
example, in the substitution cipher 

Plain   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 

Cipher  D E F G H I J K L M N O P Q R S T U V W X Y Z A B C 

the plaintext letter A is replaced by D, the plaintext letter B by E, and so forth. With this cipher the plaintext message 

ROME WAS NOT BUILT IN A DAY 
becomes 

URPH ZDV QRW EXLOW LQ D GDB 
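
The substitution cipher in the table shifts every letter three places forward, wrapping around after Z. A minimal Python sketch of this particular cipher follows; the function name and the `shift` parameter are our own.

```python
import string

def shift_encipher(plaintext, shift=3):
    """Replace each letter by the one `shift` places later in the alphabet, wrapping after Z."""
    letters = string.ascii_uppercase
    table = str.maketrans(letters, letters[shift:] + letters[:shift])
    return plaintext.upper().translate(table)

print(shift_encipher("ROME WAS NOT BUILT IN A DAY"))   # URPH ZDV QRW EXLOW LQ D GDB
```

Spaces are left unchanged because they do not appear in the translation table.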

Hill Ciphers 

A disadvantage of substitution ciphers is that they preserve the frequencies of individual letters, making it relatively easy to break 
the code by statistical methods. One way to overcome this problem is to divide the plaintext into groups of letters and encipher the 
plaintext group by group, rather than one letter at a time. A system of cryptography in which the plaintext is divided into sets of n 
letters, each of which is replaced by a set of n cipher letters, is called a polygraphic system. In this section we will study a class of
polygraphic systems based on matrix transformations. (The ciphers that we will discuss are called Hill ciphers after Lester S. Hill,
who introduced them in two papers: "Cryptography in an Algebraic Alphabet," American Mathematical Monthly, 36 (June-July
1929), pp. 306-312; and "Concerning Certain Linear Transformation Apparatus of Cryptography," American Mathematical 
Monthly, 38 (March 1931), pp. 135-154.) 

In the discussion to follow, we assume that each plaintext and ciphertext letter except Z is assigned the numerical value that 
specifies its position in the standard alphabet (Table 1). For reasons that will become clear later, Z is assigned a value of zero. 

Table 1

A   B   C   D   E   F   G   H   I   J   K   L   M   N   O   P   Q   R   S   T   U   V   W   X   Y   Z
1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  0



In the simplest Hill ciphers, successive pairs of plaintext are transformed into ciphertext by the following procedure: 



Step 1. Choose a $2 \times 2$ matrix with integer entries

\[
A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\]

to perform the encoding. Certain additional conditions on A will be imposed later.

Step 2. Group successive plaintext letters into pairs, adding an arbitrary "dummy" letter to fill out the last pair if the plaintext 
has an odd number of letters, and replace each plaintext letter by its numerical value. 



Step 3. Successively convert each plaintext pair $p_1p_2$ into a column vector

\[
\mathbf{p} = \begin{bmatrix} p_1 \\ p_2 \end{bmatrix}
\]

and form the product $A\mathbf{p}$. We will call $\mathbf{p}$ a plaintext vector and $A\mathbf{p}$ the corresponding ciphertext
vector.

Step 4. Convert each ciphertext vector into its alphabetic equivalent. 



EXAMPLE 1 Hill Cipher of a Message 



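Use the matrix

\[
\begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}
\]

to obtain the Hill cipher for the plaintext message

I AM HIDING



Solution

If we group the plaintext into pairs and add the dummy letter G to fill out the last pair, we obtain

IA MH ID IN GG

or, equivalently, from Table 1,

9 1   13 8   9 4   9 14   7 7

To encipher the pair IA, we form the matrix product

\[
\begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}\begin{bmatrix} 9 \\ 1 \end{bmatrix} = \begin{bmatrix} 11 \\ 3 \end{bmatrix}
\]

which, from Table 1, yields the ciphertext KC.
To encipher the pair MH, we form the product

\[
\begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}\begin{bmatrix} 13 \\ 8 \end{bmatrix} = \begin{bmatrix} 29 \\ 24 \end{bmatrix} \tag{1}
\]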



However, there is a problem here, because the number 29 has no alphabet equivalent (Table 1). To resolve this problem, we make 
the following agreement: 

Whenever an integer greater than 25 occurs, it will be replaced by the remainder that results when this integer is 
divided by 26. 

Because the remainder after division by 26 is one of the integers 0, 1, 2, ..., 25, this procedure will always yield an integer with an
alphabet equivalent.

Thus, in (1) we replace 29 by 3, which is the remainder after dividing 29 by 26. It now follows from Table 1 that the ciphertext for
the pair MH is CX.

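The computations for the remaining ciphertext vectors are

\[
\begin{aligned}
\begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}\begin{bmatrix} 9 \\ 4 \end{bmatrix} &= \begin{bmatrix} 17 \\ 12 \end{bmatrix} \\
\begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}\begin{bmatrix} 9 \\ 14 \end{bmatrix} &= \begin{bmatrix} 37 \\ 42 \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} 11 \\ 16 \end{bmatrix} \pmod{26} \\
\begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}\begin{bmatrix} 7 \\ 7 \end{bmatrix} &= \begin{bmatrix} 21 \\ 21 \end{bmatrix}
\end{aligned}
\]

These correspond to the ciphertext pairs QL, KP, and UU, respectively. In summary, the entire ciphertext message is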

KC CX QL KP UU 

which would usually be transmitted as a single string without spaces: 

KCCXQLKPUU 
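
Steps 1 through 4 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a definitive implementation: the helper names `letter_value`, `value_letter`, and `hill2_encipher` are hypothetical, and the dummy letter is chosen by repeating the last plaintext letter (the text allows any dummy; Example 1 happens to use G, the last letter of the message).

```python
def letter_value(ch: str) -> int:
    """Table 1: A=1, B=2, ..., Y=25, Z=0."""
    return (ord(ch) - ord("A") + 1) % 26


def value_letter(v: int) -> str:
    """Inverse of Table 1 for a residue v in 0..25."""
    return chr((v - 1) % 26 + ord("A"))


def hill2_encipher(plaintext: str, A) -> str:
    """Encipher with a 2 x 2 integer matrix A, reducing every entry modulo 26."""
    letters = [ch for ch in plaintext.upper() if "A" <= ch <= "Z"]
    if len(letters) % 2 == 1:
        letters.append(letters[-1])  # assumption: dummy letter = repeat of last letter
    out = []
    for i in range(0, len(letters), 2):
        p1, p2 = letter_value(letters[i]), letter_value(letters[i + 1])
        c1 = (A[0][0] * p1 + A[0][1] * p2) % 26
        c2 = (A[1][0] * p1 + A[1][1] * p2) % 26
        out.append(value_letter(c1) + value_letter(c2))
    return "".join(out)


print(hill2_encipher("I AM HIDING", [[1, 2], [0, 3]]))  # KCCXQLKPUU
```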



Because the plaintext was grouped in pairs and enciphered by a $2 \times 2$ matrix, the Hill cipher in Example 1 is referred to as a Hill
2-cipher. It is obviously also possible to group the plaintext in triples and encipher by a $3 \times 3$ matrix with integer entries; this is
called a Hill 3-cipher. In general, for a Hill n-cipher, plaintext is grouped into sets of n letters and enciphered by an $n \times n$ matrix
with integer entries.

Modular Arithmetic 

In Example 1, integers greater than 25 were replaced by their remainders after division by 26. This technique of working with 
remainders is at the core of a body of mathematics called modular arithmetic. Because of its importance in cryptography, we will 
digress for a moment to touch on some of the main ideas in this area. 

In modular arithmetic we are given a positive integer m, called the modulus, and any two integers whose difference is an integer 
multiple of the modulus are regarded as "equal" or "equivalent" with respect to the modulus. More precisely, we make the 
following definition. 



DEFINITION 



If m is a positive integer and a and b are any integers, then we say that a is equivalent to b modulo m, written

\[
a \equiv b \pmod{m}
\]

if $a - b$ is an integer multiple of m.



EXAMPLE 2 Various Equivalences 



\[
\begin{aligned}
7 &\equiv 2 \pmod{5} \\
19 &\equiv 3 \pmod{2} \\
-1 &\equiv 25 \pmod{26} \\
12 &\equiv 0 \pmod{4}
\end{aligned}
\]

For any modulus m it can be proved that every integer a is equivalent, modulo m, to exactly one of the integers

\[
0, 1, 2, \ldots, m-1
\]

We call this integer the residue of a modulo m, and we write

\[
Z_m = \{0, 1, 2, \ldots, m-1\}
\]

to denote the set of residues modulo m.

If a is a nonnegative integer, then its residue modulo m is simply the remainder that results when a is divided by m. For an 
arbitrary integer a, the residue can be found using the following theorem. 

THEOREM 11.16.1 



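For any integer a and modulus m, let

\[
R = \text{remainder of } \frac{|a|}{m}
\]

Then the residue r of a modulo m is given by

\[
r = \begin{cases}
R & \text{if } a \geq 0 \\
m - R & \text{if } a < 0 \text{ and } R \neq 0 \\
0 & \text{if } a < 0 \text{ and } R = 0
\end{cases}
\]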



EXAMPLE 3 Residues mod 26 



Find the residue modulo 26 of (a) 87, (b) -38, and (c) -26. 

Solution (a) 

Dividing $|87| = 87$ by 26 yields a remainder of $R = 9$, so $r = 9$. Thus,

\[
87 \equiv 9 \pmod{26}
\]

Solution (b)

Dividing $|-38| = 38$ by 26 yields a remainder of $R = 12$, so $r = 26 - 12 = 14$. Thus,

\[
-38 \equiv 14 \pmod{26}
\]

Solution (c)

Dividing $|-26| = 26$ by 26 yields a remainder of $R = 0$. Thus,

\[
-26 \equiv 0 \pmod{26}
\]
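
Theorem 11.16.1 translates directly into code. The sketch below (the function name `residue` is an illustrative assumption) follows the three cases of the theorem and, as a sanity check, compares the result with Python's `%` operator, which already returns the residue in $Z_m$ for a positive modulus.

```python
def residue(a: int, m: int) -> int:
    """Residue of a modulo m, computed case by case as in Theorem 11.16.1."""
    R = abs(a) % m  # remainder of |a| / m
    if a >= 0:
        return R
    return m - R if R != 0 else 0


for a in (87, -38, -26):
    print(a, residue(a, 26))  # 87 9, then -38 14, then -26 0

# Python's % operator agrees with the theorem for every integer tried here.
assert all(residue(a, 26) == a % 26 for a in range(-100, 101))
```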

In ordinary arithmetic every nonzero number a has a reciprocal or multiplicative inverse, denoted by $a^{-1}$, such that

\[
aa^{-1} = a^{-1}a = 1
\]

In modular arithmetic we have the following corresponding concept:



DEFINITION



If a is a number in $Z_m$, then a number $a^{-1}$ in $Z_m$ is called a reciprocal or multiplicative inverse of a modulo m if

\[
aa^{-1} = a^{-1}a = 1 \pmod{m}
\]



It can be proved that if a and m have no common prime factors, then a has a unique reciprocal modulo m; conversely, if a and m 
have a common prime factor, then a has no reciprocal modulo m. 



EXAMPLE 4 Reciprocal of 3 mod 26 



The number 3 has a reciprocal modulo 26 because 3 and 26 have no common prime factors. This reciprocal can be obtained by
finding the number x in $Z_{26}$ that satisfies the modular equation

\[
3x = 1 \pmod{26}
\]

Although there are general methods for solving such modular equations, it would take us too far afield to study them. However,
because 26 is relatively small, this equation can be solved by trying the possible solutions, 0 to 25, one at a time. With this
approach we find that $x = 9$ is the solution, because

\[
3 \cdot 9 = 27 = 1 \pmod{26}
\]

Thus,

\[
3^{-1} = 9 \pmod{26}
\]



EXAMPLE 5 A Number with No Reciprocal mod 26 



The number 4 has no reciprocal modulo 26, because 4 and 26 have 2 as a common prime factor (see Exercise 8). 
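
The trial-and-error search used in Examples 4 and 5 is easy to mechanize. The following sketch (the name `reciprocal` is an illustrative assumption) tries every candidate in $Z_m$ and returns `None` when no reciprocal exists; collecting the successes for $m = 26$ lists every residue that has a reciprocal.

```python
def reciprocal(a: int, m: int = 26):
    """Return x in Z_m with a*x = 1 (mod m), or None if a has no reciprocal."""
    for x in range(m):
        if (a * x) % m == 1:
            return x
    return None


print(reciprocal(3))  # 9
print(reciprocal(4))  # None, since 4 and 26 share the prime factor 2

# All residues modulo 26 that have a reciprocal, paired with that reciprocal.
table = {a: reciprocal(a) for a in range(26) if reciprocal(a) is not None}
print(table)
```

Exhaustive search is adequate only because the modulus is small; the general methods mentioned in Example 4 are not shown here.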



For future reference, we provide the following table of reciprocals modulo 26:

Table 2

Reciprocals Modulo 26

a         1   3   5   7   9   11  15  17  19  21  23  25
a^{-1}    1   9   21  15  3   19  7   23  11  5   17  25



Deciphering



Every useful cipher must have a procedure for decipherment. In the case of a Hill cipher, decipherment uses the inverse (mod 26)
of the enciphering matrix. To be precise, if m is a positive integer, then a square matrix A with entries in $Z_m$ is said to be invertible
modulo m if there is a matrix B with entries in $Z_m$ such that

\[
AB = BA = I \pmod{m}
\]



Suppose now that

\[
A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\]

is invertible modulo 26 and this matrix is used in a Hill 2-cipher. If

\[
\mathbf{p} = \begin{bmatrix} p_1 \\ p_2 \end{bmatrix}
\]

is a plaintext vector, then

\[
\mathbf{c} = A\mathbf{p} \pmod{26}
\]

is the corresponding ciphertext vector and

\[
\mathbf{p} = A^{-1}\mathbf{c} \pmod{26}
\]

Thus, each plaintext vector can be recovered from the corresponding ciphertext vector by multiplying it on the left by
$A^{-1}$ (mod 26).

In cryptography it is important to know which matrices are invertible modulo 26 and how to obtain their inverses. We now 
investigate these questions. 

In ordinary arithmetic, a square matrix A is invertible if and only if $\det(A) \neq 0$, or, equivalently, if and only if $\det(A)$ has a
reciprocal. The following theorem is the analog of this result in modular arithmetic.



THEOREM 11.16.2 



A square matrix A with entries in $Z_m$ is invertible modulo m if and only if the residue of $\det(A)$ modulo m has a reciprocal
modulo m.



Because the residue of $\det(A)$ modulo m will have a reciprocal modulo m if and only if this residue and m have no common prime
factors, we have the following corollary.



COROLLARY 11.16.3



A square matrix A with entries in $Z_m$ is invertible modulo m if and only if m and the residue of $\det(A)$ modulo m have no
common prime factors.



Because the only prime factors of m = 26 are 2 and 13, we have the following corollary, which is useful in cryptography. 



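COROLLARY 11.16.4



A square matrix A with entries in $Z_{26}$ is invertible modulo 26 if and only if the residue of $\det(A)$ modulo 26 is not divisible
by 2 or 13.



We leave it for the reader to verify that if

\[
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]

has entries in $Z_{26}$ and the residue of $\det(A) = ad - bc$ modulo 26 is not divisible by 2 or 13, then the inverse of A (mod 26) is
given by

\[
A^{-1} = (ad - bc)^{-1}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \pmod{26} \tag{2}
\]

where $(ad - bc)^{-1}$ is the reciprocal of the residue of $ad - bc$ (mod 26).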



EXAMPLE 6 Inverse of a Matrix mod 26 



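Find the inverse of

\[
A = \begin{bmatrix} 5 & 6 \\ 2 & 3 \end{bmatrix}
\]

modulo 26.



Solution

\[
\det(A) = ad - bc = 5 \cdot 3 - 6 \cdot 2 = 3
\]

so from Table 2,

\[
(ad - bc)^{-1} = 3^{-1} = 9 \pmod{26}
\]

Thus, from (2),

\[
A^{-1} = 9\begin{bmatrix} 3 & -6 \\ -2 & 5 \end{bmatrix} = \begin{bmatrix} 27 & -54 \\ -18 & 45 \end{bmatrix} = \begin{bmatrix} 1 & 24 \\ 8 & 19 \end{bmatrix} \pmod{26}
\]

As a check,

\[
AA^{-1} = \begin{bmatrix} 5 & 6 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} 1 & 24 \\ 8 & 19 \end{bmatrix} = \begin{bmatrix} 53 & 234 \\ 26 & 105 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \pmod{26}
\]

Similarly, $A^{-1}A = I$.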
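
Formula (2) and the invertibility test of Corollary 11.16.4 can be combined into one small routine. This is a sketch under stated assumptions: the function name `inverse_mod26` is hypothetical, and the reciprocal of the determinant is found by exhaustive search rather than by a table lookup.

```python
def inverse_mod26(A):
    """Inverse modulo 26 of a 2 x 2 integer matrix A, following formula (2)."""
    (a, b), (c, d) = A
    det = (a * d - b * c) % 26  # residue of det(A) modulo 26
    if det % 2 == 0 or det % 13 == 0:
        raise ValueError("matrix is not invertible modulo 26")
    # Reciprocal of the determinant, found by trying 0..25 (it exists here).
    det_inv = next(x for x in range(26) if (det * x) % 26 == 1)
    return [[(det_inv * d) % 26, (-det_inv * b) % 26],
            [(-det_inv * c) % 26, (det_inv * a) % 26]]


A = [[5, 6], [2, 3]]
A_inv = inverse_mod26(A)
print(A_inv)  # [[1, 24], [8, 19]]

# Check that A times A_inv is the identity modulo 26.
product = [[sum(A[i][k] * A_inv[k][j] for k in range(2)) % 26 for j in range(2)]
           for i in range(2)]
print(product)  # [[1, 0], [0, 1]]
```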



EXAMPLE 7 Decoding a Hill 2-Cipher 



Decode the following Hill 2-cipher, which was enciphered by the matrix in Example 6: 

GTNKGKDUSK 



Solution 

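From Table 1 the numerical equivalent of this ciphertext is

7 20   14 11   7 11   4 21   19 11

To obtain the plaintext pairs, we multiply each ciphertext vector by the inverse of A (obtained in Example 6):

\[
\begin{aligned}
\begin{bmatrix} 1 & 24 \\ 8 & 19 \end{bmatrix}\begin{bmatrix} 7 \\ 20 \end{bmatrix} &= \begin{bmatrix} 487 \\ 436 \end{bmatrix} = \begin{bmatrix} 19 \\ 20 \end{bmatrix} \pmod{26} \\
\begin{bmatrix} 1 & 24 \\ 8 & 19 \end{bmatrix}\begin{bmatrix} 14 \\ 11 \end{bmatrix} &= \begin{bmatrix} 278 \\ 321 \end{bmatrix} = \begin{bmatrix} 18 \\ 9 \end{bmatrix} \pmod{26} \\
\begin{bmatrix} 1 & 24 \\ 8 & 19 \end{bmatrix}\begin{bmatrix} 7 \\ 11 \end{bmatrix} &= \begin{bmatrix} 271 \\ 265 \end{bmatrix} = \begin{bmatrix} 11 \\ 5 \end{bmatrix} \pmod{26} \\
\begin{bmatrix} 1 & 24 \\ 8 & 19 \end{bmatrix}\begin{bmatrix} 4 \\ 21 \end{bmatrix} &= \begin{bmatrix} 508 \\ 431 \end{bmatrix} = \begin{bmatrix} 14 \\ 15 \end{bmatrix} \pmod{26} \\
\begin{bmatrix} 1 & 24 \\ 8 & 19 \end{bmatrix}\begin{bmatrix} 19 \\ 11 \end{bmatrix} &= \begin{bmatrix} 283 \\ 361 \end{bmatrix} = \begin{bmatrix} 23 \\ 23 \end{bmatrix} \pmod{26}
\end{aligned}
\]

From Table 1, the alphabet equivalents of these vectors are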

ST RI KE NO WW 
which yields the message 

STRIKE NOW 
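
Deciphering is the same matrix-times-vector computation as enciphering, with $A^{-1}$ in place of A. As a hedged sketch (the function name `hill2_decipher` is an illustrative assumption, and the inverse matrix is taken as given from Example 6 rather than recomputed):

```python
def hill2_decipher(ciphertext: str, A_inv) -> str:
    """Multiply each ciphertext pair on the left by A_inv, working modulo 26."""
    values = [(ord(ch) - ord("A") + 1) % 26 for ch in ciphertext]  # Table 1, Z = 0
    out = []
    for i in range(0, len(values), 2):
        c1, c2 = values[i], values[i + 1]
        p1 = (A_inv[0][0] * c1 + A_inv[0][1] * c2) % 26
        p2 = (A_inv[1][0] * c1 + A_inv[1][1] * c2) % 26
        out.append(chr((p1 - 1) % 26 + ord("A")))
        out.append(chr((p2 - 1) % 26 + ord("A")))
    return "".join(out)


print(hill2_decipher("GTNKGKDUSK", [[1, 24], [8, 19]]))  # STRIKENOWW
```

The trailing W is the dummy letter that filled out the last pair; the routine cannot tell a dummy from real plaintext, so the reader must discard it.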


Breaking a Hill Cipher 

Because the purpose of enciphering messages and information is to prevent "opponents" from learning their contents, 
cryptographers are concerned with the security of their ciphers — that is, how readily they can be broken (deciphered by their 
opponents). We will conclude this section by discussing one technique for breaking Hill ciphers. 

Suppose that you are able to obtain some corresponding plaintext and ciphertext from an opponent's message. For example, on 
examining some intercepted ciphertext, you may be able to deduce that the message is a letter that begins DEAR SIR. We will show 
that with a small amount of such data, it may be possible to determine the deciphering matrix of a Hill code and consequently 
obtain access to the rest of the message. 

It is a basic result in linear algebra that a linear transformation is completely determined by its values at a basis. This principle
suggests that if we have a Hill n-cipher, and if

\[
\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n
\]

are linearly independent plaintext vectors whose corresponding ciphertext vectors

\[
A\mathbf{p}_1, A\mathbf{p}_2, \ldots, A\mathbf{p}_n
\]

are known, then there is enough information available to determine the matrix A and hence $A^{-1}$ (mod m).
The following theorem, whose proof is discussed in the exercises, provides a way to do this.



THEOREM 11.16.5 



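Determining the Deciphering Matrix

Let $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ be linearly independent plaintext vectors, and let $\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_n$ be the
corresponding ciphertext vectors in a Hill n-cipher. If

\[
P = \begin{bmatrix} \mathbf{p}_1^T \\ \mathbf{p}_2^T \\ \vdots \\ \mathbf{p}_n^T \end{bmatrix}
\]

is the $n \times n$ matrix with row vectors $\mathbf{p}_1^T, \mathbf{p}_2^T, \ldots, \mathbf{p}_n^T$ and if

\[
C = \begin{bmatrix} \mathbf{c}_1^T \\ \mathbf{c}_2^T \\ \vdots \\ \mathbf{c}_n^T \end{bmatrix}
\]

is the $n \times n$ matrix with row vectors $\mathbf{c}_1^T, \mathbf{c}_2^T, \ldots, \mathbf{c}_n^T$, then the sequence of elementary row operations
that reduces C to I transforms P to $(A^{-1})^T$.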



This theorem tells us that to find the transpose of the deciphering matrix $A^{-1}$, we must find a sequence of row operations that
reduces C to I and then perform this same sequence of operations on P. The following example illustrates a simple algorithm for
doing this.



EXAMPLE 8 Using Theorem 11.16.5



The following Hill 2-cipher is intercepted: 

IOSBTGXESPXHOPDE
Decipher the message, given that it starts with the word DEAR. 

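Solution

From Table 1, the numerical equivalent of the known plaintext is

DE    AR
4 5   1 18

and the numerical equivalent of the corresponding ciphertext is

IO     SB
9 15   19 2

so the corresponding plaintext and ciphertext vectors are

\[
\mathbf{p}_1 = \begin{bmatrix} 4 \\ 5 \end{bmatrix} \leftrightarrow \mathbf{c}_1 = \begin{bmatrix} 9 \\ 15 \end{bmatrix}
\qquad
\mathbf{p}_2 = \begin{bmatrix} 1 \\ 18 \end{bmatrix} \leftrightarrow \mathbf{c}_2 = \begin{bmatrix} 19 \\ 2 \end{bmatrix}
\]

We want to reduce

\[
C = \begin{bmatrix} \mathbf{c}_1^T \\ \mathbf{c}_2^T \end{bmatrix} = \begin{bmatrix} 9 & 15 \\ 19 & 2 \end{bmatrix}
\]

to I by elementary row operations and simultaneously apply these operations to

\[
P = \begin{bmatrix} \mathbf{p}_1^T \\ \mathbf{p}_2^T \end{bmatrix} = \begin{bmatrix} 4 & 5 \\ 1 & 18 \end{bmatrix}
\]

to obtain $(A^{-1})^T$ (the transpose of the deciphering matrix). This can be accomplished by adjoining P to
the right of C and applying row operations to the resulting matrix $[C \mid P]$ until the left side is reduced to
I. The final matrix will then have the form $[I \mid (A^{-1})^T]$. The computations can be carried out as follows:

\[
\begin{aligned}
&\left[\begin{array}{cc|cc} 9 & 15 & 4 & 5 \\ 19 & 2 & 1 & 18 \end{array}\right] && \text{We formed the matrix } [C \mid P]. \\
&\left[\begin{array}{cc|cc} 1 & 45 & 12 & 15 \\ 19 & 2 & 1 & 18 \end{array}\right] && \text{We multiplied the first row by } 9^{-1} = 3. \\
&\left[\begin{array}{cc|cc} 1 & 19 & 12 & 15 \\ 19 & 2 & 1 & 18 \end{array}\right] && \text{We replaced 45 by its residue modulo 26.} \\
&\left[\begin{array}{cc|cc} 1 & 19 & 12 & 15 \\ 0 & -359 & -227 & -267 \end{array}\right] && \text{We added } -19 \text{ times the first row to the second.} \\
&\left[\begin{array}{cc|cc} 1 & 19 & 12 & 15 \\ 0 & 5 & 7 & 19 \end{array}\right] && \text{We replaced the entries in the second row by their residues modulo 26.} \\
&\left[\begin{array}{cc|cc} 1 & 19 & 12 & 15 \\ 0 & 1 & 147 & 399 \end{array}\right] && \text{We multiplied the second row by } 5^{-1} = 21. \\
&\left[\begin{array}{cc|cc} 1 & 19 & 12 & 15 \\ 0 & 1 & 17 & 9 \end{array}\right] && \text{We replaced the entries in the second row by their residues modulo 26.} \\
&\left[\begin{array}{cc|cc} 1 & 0 & -311 & -156 \\ 0 & 1 & 17 & 9 \end{array}\right] && \text{We added } -19 \text{ times the second row to the first.} \\
&\left[\begin{array}{cc|cc} 1 & 0 & 1 & 0 \\ 0 & 1 & 17 & 9 \end{array}\right] && \text{We replaced the entries in the first row by their residues modulo 26.}
\end{aligned}
\]

Thus,

\[
(A^{-1})^T = \begin{bmatrix} 1 & 0 \\ 17 & 9 \end{bmatrix}
\]

so the deciphering matrix is

\[
A^{-1} = \begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix}
\]

To decipher the message, we first group the ciphertext into pairs and find the numerical equivalent of
each letter:

IO     SB     TG     XE     SP      XH     OP      DE
9 15   19 2   20 7   24 5   19 16   24 8   15 16   4 5

Next, we multiply successive ciphertext vectors on the left by $A^{-1}$ and find the alphabet equivalents of
the resulting plaintext pairs:

\[
\begin{aligned}
\begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix}\begin{bmatrix} 9 \\ 15 \end{bmatrix} &= \begin{bmatrix} 4 \\ 5 \end{bmatrix} \pmod{26} && \text{D, E} \\
\begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix}\begin{bmatrix} 19 \\ 2 \end{bmatrix} &= \begin{bmatrix} 1 \\ 18 \end{bmatrix} \pmod{26} && \text{A, R} \\
\begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix}\begin{bmatrix} 20 \\ 7 \end{bmatrix} &= \begin{bmatrix} 9 \\ 11 \end{bmatrix} \pmod{26} && \text{I, K} \\
\begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix}\begin{bmatrix} 24 \\ 5 \end{bmatrix} &= \begin{bmatrix} 5 \\ 19 \end{bmatrix} \pmod{26} && \text{E, S} \\
\begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix}\begin{bmatrix} 19 \\ 16 \end{bmatrix} &= \begin{bmatrix} 5 \\ 14 \end{bmatrix} \pmod{26} && \text{E, N} \\
\begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix}\begin{bmatrix} 24 \\ 8 \end{bmatrix} &= \begin{bmatrix} 4 \\ 20 \end{bmatrix} \pmod{26} && \text{D, T} \\
\begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix}\begin{bmatrix} 15 \\ 16 \end{bmatrix} &= \begin{bmatrix} 1 \\ 14 \end{bmatrix} \pmod{26} && \text{A, N} \\
\begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix}\begin{bmatrix} 4 \\ 5 \end{bmatrix} &= \begin{bmatrix} 11 \\ 19 \end{bmatrix} \pmod{26} && \text{K, S}
\end{aligned}
\]

Finally, we construct the message from the plaintext pairs:

DE AR IK ES EN DT AN KS
DEAR IKE SEND TANKS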

Further Readings 

Readers interested in learning more about mathematical cryptography are referred to the following books, the first of which is 
elementary and the second more advanced. 

1. Abraham Sinkov, Elementary Cryptanalysis, a Mathematical Approach (Mathematical Association of America, 
Mathematical Library, 1966). 



2. Alan G. Konheim, Cryptography, a Primer (New York: Wiley-Interscience, 1981). 



Exercise Set 11. 16 






1. Obtain the Hill cipher of the message

DARK NIGHT

for each of the following enciphering matrices:

(a) $\begin{bmatrix} 1 & 3 \\ 2 & 1 \end{bmatrix}$

(b) $\begin{bmatrix} 4 & 3 \\ 1 & 2 \end{bmatrix}$


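2. In each part determine whether the matrix is invertible modulo 26. If so, find its inverse modulo 26 and check your work by
verifying that $AA^{-1} = A^{-1}A = I$ (mod 26).

(a) $A = \begin{bmatrix} 9 & 1 \\ 7 & 2 \end{bmatrix}$

(b) $A = \begin{bmatrix} 3 & 1 \\ 5 & 3 \end{bmatrix}$

(c) $A = \begin{bmatrix} 8 & 11 \\ 1 & 9 \end{bmatrix}$

(d) $A = \begin{bmatrix} 2 & 1 \\ 1 & 7 \end{bmatrix}$

(e) $A = \begin{bmatrix} 3 & 1 \\ 6 & 2 \end{bmatrix}$

(f) $A = \begin{bmatrix} 1 & 8 \\ 1 & 3 \end{bmatrix}$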


3. Decode the message

SAKNOXAOJX

given that it is a Hill cipher with enciphering matrix

\[
\begin{bmatrix} 4 & 1 \\ 3 & 2 \end{bmatrix}
\]



4. A Hill 2-cipher is intercepted that starts with the pairs

SL HK

Find the deciphering and enciphering matrices, given that the plaintext is known to start with the word ARMY.



5. Decode the following Hill 2-cipher if the last four plaintext letters are known to be ATOM.

LNGIHGYBVRENJYQO



6. Decode the following Hill 3-cipher if the first nine plaintext letters are IHAVECOME:

HPAFQGGDUGDDHPGODYNOR

7. All of the results of this section can be generalized to the case where the plaintext is a binary message; that is, it is a sequence of
0's and 1's. In this case we do all of our modular arithmetic using modulus 2 rather than modulus 26. Thus, for example,
$1 + 1 = 0 \pmod{2}$. Suppose we want to encrypt the message 110101111. Let us first break it into triplets to form the three
vectors

\[
\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
\]

and let us take

\[
\begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}
\]

as our enciphering matrix.

(a) Find the encoded message.

(b) Find the inverse modulo 2 of the enciphering matrix, and verify that it decodes your encoded message.



8. If, in addition to the standard alphabet, a period, comma, and question mark were allowed, then 29 plaintext and ciphertext
symbols would be available and all matrix arithmetic would be done modulo 29. Under what conditions would a matrix with
entries in $Z_{29}$ be invertible modulo 29?



9. Show that the modular equation $4x = 1 \pmod{26}$ has no solution in $Z_{26}$ by successively substituting the values $x = 0, 1, 2, \ldots, 25$.



10. (a) Let P and C be the matrices in Theorem 11.16.5. Show that $P = C(A^{-1})^T$.

(b) To prove Theorem 11.16.5, let $E_1, E_2, \ldots, E_n$ be the elementary matrices that correspond to the row operations that
reduce C to I, so

\[
E_n \cdots E_2E_1C = I
\]

Show that

\[
E_n \cdots E_2E_1P = (A^{-1})^T
\]

from which it follows that the same sequence of row operations that reduces C to I converts P
to $(A^{-1})^T$.



11. (a) If A is the enciphering matrix of a Hill n-cipher, show that

\[
A^{-1} = (C^{-1}P)^T \pmod{26}
\]

where C and P are the matrices defined in Theorem 11.16.5.

(b) Instead of using Theorem 11.16.5 as in the text, find the deciphering matrix $A^{-1}$ of Example 8 by using the result in
part (a) and Equation 2 to compute $C^{-1}$.

Note Although this method is practical for Hill 2-ciphers, Theorem 11.16.5 is more efficient for Hill n-ciphers with
$n > 2$.



Section 11.16 



Technology Exercises



The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 

T1. Two integers that have no common factors (except 1) are said to be relatively prime. Given a positive integer n, let
$S_n = \{a_1, a_2, a_3, \ldots, a_m\}$, where $a_1 < a_2 < a_3 < \cdots < a_m$, be the set of all positive integers less than n and relatively prime to n.
For example, if $n = 9$, then

\[
S_9 = \{a_1, a_2, a_3, \ldots, a_6\} = \{1, 2, 4, 5, 7, 8\}
\]

(a) Construct a table consisting of n and $S_n$ for $n = 2, 3, \ldots, 15$, and then compute

\[
\sum_{k=1}^{m} a_k \quad \text{and} \quad \left(\sum_{k=1}^{m} a_k\right) \pmod{n}
\]

in each case. Draw a conjecture for $n > 15$ and prove your conjecture to be true.
Hint Use the fact that if a is relatively prime to n, then $n - a$ is also relatively prime to n.

(b) Given a positive integer n and the set $S_n$, let $P_n$ be the $m \times m$ matrix

\[
P_n = \begin{bmatrix}
a_1 & a_2 & a_3 & \cdots & a_{m-1} & a_m \\
a_2 & a_3 & a_4 & \cdots & a_m & a_1 \\
a_3 & a_4 & a_5 & \cdots & a_1 & a_2 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
a_{m-1} & a_m & a_1 & \cdots & a_{m-3} & a_{m-2} \\
a_m & a_1 & a_2 & \cdots & a_{m-2} & a_{m-1}
\end{bmatrix}
\]

so that, for example,

\[
P_9 = \begin{bmatrix}
1 & 2 & 4 & 5 & 7 & 8 \\
2 & 4 & 5 & 7 & 8 & 1 \\
4 & 5 & 7 & 8 & 1 & 2 \\
5 & 7 & 8 & 1 & 2 & 4 \\
7 & 8 & 1 & 2 & 4 & 5 \\
8 & 1 & 2 & 4 & 5 & 7
\end{bmatrix}
\]

Use a computer to compute $\det(P_n)$ and $\det(P_n) \pmod{n}$ for $n = 2, 3, \ldots, 15$, and then use these
results to construct a conjecture.

(c) Use the results of part (a) to prove your conjecture to be true.

Hint Add the first $m - 1$ rows of $P_n$ to its last row and then use Theorem 2.3.3. What do these results imply about the
inverse of $P_n$ (mod n)?

T2. Given a positive integer n greater than 1, the number of positive integers less than n and relatively prime to n is called the
Euler phi function of n and is denoted by $\varphi(n)$. For example, $\varphi(6) = 2$ since only two positive integers (1 and 5) are less than
6 and have no common factor with 6.

(a) Using a computer, for each value of $n = 2, 3, \ldots, 25$ compute and print out all positive integers that are less than n and
relatively prime to n. Then use these integers to determine the values of $\varphi(n)$ for $n = 2, 3, \ldots, 25$. Can you discover a
pattern in the results?

(b) It can be shown that if $\{p_1, p_2, p_3, \ldots, p_m\}$ are all the distinct prime factors of n, then

\[
\varphi(n) = n\left(1 - \frac{1}{p_1}\right)\left(1 - \frac{1}{p_2}\right)\cdots\left(1 - \frac{1}{p_m}\right)
\]

For example, since {2, 3} are the distinct prime factors of 12, we have

\[
\varphi(12) = 12\left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right) = 4
\]

which agrees with the fact that {1, 5, 7, 11} are the only positive integers less than 12 and
relatively prime to 12. Using a computer, print out all the prime factors of n for $n = 2, 3, \ldots, 25$.
Then compute $\varphi(n)$ using the formula above and compare it to your results in part (a).
11.17

GENETICS 



In this section we investigate the propagation of an inherited trait in successive 
generations by computing powers of a matrix. 



Prerequisites: Eigenvalues and Eigenvectors 
Diagonalization of a Matrix 
Intuitive Understanding of Limits 



Inheritance Traits 

In this section we examine the inheritance of traits in animals or plants. The inherited trait under consideration is assumed to be 
governed by a set of two genes, which we designate by A and a. Under autosomal inheritance each individual in the population of 
either gender possesses two of these genes, the possible pairings being designated AA, Aa, and aa. This pair of genes is called the
individual's genotype, and it determines how the trait controlled by the genes is manifested in the individual. For example, in
snapdragons a set of two genes determines the color of the flower. Genotype AA produces red flowers, genotype Aa produces pink
flowers, and genotype aa produces white flowers. In humans, eye coloration is controlled through autosomal inheritance.
Genotypes AA and Aa have brown eyes, and genotype aa has blue eyes. In this case we say that gene A dominates gene a, or that
gene a is recessive to gene A, because genotype Aa has the same outward trait as genotype AA.

In addition to autosomal inheritance we will also discuss X-linked inheritance. In this type of inheritance, the male of the species 
possesses only one of the two possible genes (A or a), and the female possesses a pair of the two genes (AA, Aa, or aa). In humans,
color blindness, hereditary baldness, hemophilia, and muscular dystrophy, to name a few, are traits controlled by X-linked 
inheritance. 

Below we explain the manner in which the genes of the parents are passed on to their offspring for the two types of inheritance. 
We construct matrix models that give the probable genotypes of the offspring in terms of the genotypes of the parents, and we use 
these matrix models to follow the genotype distribution of a population through successive generations. 

Autosomal Inheritance 

In autosomal inheritance an individual inherits one gene from each of its parents' pairs of genes to form its own particular pair. As 
far as we know, it is a matter of chance which of the two genes a parent passes on to the offspring. Thus, if one parent is of 
genotype Aa, it is equally likely that the offspring will inherit the A gene or the a gene from that parent. If one parent is of genotype 
aa and the other parent is of genotype Aa, the offspring will always receive an a gene from the aa parent and will receive either an 
A gene or an a gene, with equal probability, from the Aa parent. Consequently, each of the offspring has equal probability of being 
genotype aa or Aa. In Table 1 we list the probabilities of the possible genotypes of the offspring for all possible combinations of
the genotypes of the parents.

Table 1

                            Genotypes of Parents
Genotype of
Offspring       AA-AA   AA-Aa   AA-aa   Aa-Aa   Aa-aa   aa-aa

AA              1       1/2     0       1/4     0       0
Aa              0       1/2     1       1/2     1/2     0
aa              0       0       0       1/4     1/2     1


EXAMPLE 1 Distribution of Genotypes in a Population 



Suppose that a farmer has a large population of plants consisting of some distribution of all three possible genotypes AA, Aa, and
aa. The farmer desires to undertake a breeding program in which each plant in the population is always fertilized with a plant of
genotype AA and is then replaced by one of its offspring. We want to derive an expression for the distribution of the three possible
genotypes in the population after any number of generations.

For $n = 0, 1, 2, \ldots$, let us set

$a_n$ = fraction of plants of genotype AA in nth generation
$b_n$ = fraction of plants of genotype Aa in nth generation
$c_n$ = fraction of plants of genotype aa in nth generation

Thus $a_0$, $b_0$, and $c_0$ specify the initial distribution of the genotypes. We also have that

\[
a_n + b_n + c_n = 1 \quad \text{for } n = 0, 1, 2, \ldots
\]

From Table 1 we can determine the genotype distribution of each generation from the genotype distribution of the preceding
generation by the following equations:

\[
\begin{aligned}
a_n &= a_{n-1} + \tfrac{1}{2}b_{n-1} \\
b_n &= c_{n-1} + \tfrac{1}{2}b_{n-1} \qquad n = 1, 2, \ldots \\
c_n &= 0
\end{aligned} \tag{1}
\]

For example, the first of these three equations states that all the offspring of a plant of genotype AA will be of genotype AA under
this breeding program and that half of the offspring of a plant of genotype Aa will be of genotype AA.
Equations 1 can be written in matrix notation as

\[
\mathbf{x}^{(n)} = M\mathbf{x}^{(n-1)}, \quad n = 1, 2, \ldots \tag{2}
\]

where

\[
\mathbf{x}^{(n)} = \begin{bmatrix} a_n \\ b_n \\ c_n \end{bmatrix}, \quad
\mathbf{x}^{(n-1)} = \begin{bmatrix} a_{n-1} \\ b_{n-1} \\ c_{n-1} \end{bmatrix}, \quad \text{and} \quad
M = \begin{bmatrix} 1 & \tfrac{1}{2} & 0 \\ 0 & \tfrac{1}{2} & 1 \\ 0 & 0 & 0 \end{bmatrix}
\]

Note that the three columns of the matrix M are the same as the first three columns of Table 1.



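From Equation 2 it follows that

\[
\mathbf{x}^{(n)} = M\mathbf{x}^{(n-1)} = M^2\mathbf{x}^{(n-2)} = \cdots = M^n\mathbf{x}^{(0)} \tag{3}
\]

Consequently, if we can find an explicit expression for $M^n$, we can use (3) to obtain an explicit expression for $\mathbf{x}^{(n)}$. To find an
explicit expression for $M^n$, we first diagonalize M. That is, we find an invertible matrix P and a diagonal matrix D such that

\[
M = PDP^{-1} \tag{4}
\]

With such a diagonalization, we then have (see Exercise 1)

\[
M^n = PD^nP^{-1} \quad \text{for } n = 1, 2, \ldots
\]

where

\[
D^n = \begin{bmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \lambda_k
\end{bmatrix}^n
= \begin{bmatrix}
\lambda_1^n & 0 & \cdots & 0 \\
0 & \lambda_2^n & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \lambda_k^n
\end{bmatrix}
\]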

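The diagonalization of M is accomplished by finding its eigenvalues and corresponding eigenvectors. These are as follows (verify):

Eigenvalues: $\lambda_1 = 1, \quad \lambda_2 = \tfrac{1}{2}, \quad \lambda_3 = 0$

Corresponding eigenvectors:
\[
\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad
\mathbf{v}_2 = \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}, \quad
\mathbf{v}_3 = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}
\]

Thus, in Equation 4 we have

\[
D = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]

and

\[
P = [\mathbf{v}_1 \mid \mathbf{v}_2 \mid \mathbf{v}_3] = \begin{bmatrix} 1 & 1 & 1 \\ 0 & -1 & -2 \\ 0 & 0 & 1 \end{bmatrix}
\]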

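Therefore,

\[
\mathbf{x}^{(n)} = PD^nP^{-1}\mathbf{x}^{(0)} =
\begin{bmatrix} 1 & 1 & 1 \\ 0 & -1 & -2 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \left(\tfrac{1}{2}\right)^n & 0 \\ 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & -1 & -2 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} a_0 \\ b_0 \\ c_0 \end{bmatrix}
\]

or

\[
\mathbf{x}^{(n)} = \begin{bmatrix} a_n \\ b_n \\ c_n \end{bmatrix}
= \begin{bmatrix}
1 & 1 - \left(\tfrac{1}{2}\right)^n & 1 - \left(\tfrac{1}{2}\right)^{n-1} \\
0 & \left(\tfrac{1}{2}\right)^n & \left(\tfrac{1}{2}\right)^{n-1} \\
0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} a_0 \\ b_0 \\ c_0 \end{bmatrix}
= \begin{bmatrix}
a_0 + b_0 + c_0 - \left(\tfrac{1}{2}\right)^n b_0 - \left(\tfrac{1}{2}\right)^{n-1} c_0 \\
\left(\tfrac{1}{2}\right)^n b_0 + \left(\tfrac{1}{2}\right)^{n-1} c_0 \\
0
\end{bmatrix}
\]

Using the fact that $a_0 + b_0 + c_0 = 1$, we thus have

\[
\begin{aligned}
a_n &= 1 - \left(\tfrac{1}{2}\right)^n b_0 - \left(\tfrac{1}{2}\right)^{n-1} c_0 \\
b_n &= \left(\tfrac{1}{2}\right)^n b_0 + \left(\tfrac{1}{2}\right)^{n-1} c_0 \qquad n = 1, 2, \ldots \\
c_n &= 0
\end{aligned} \tag{5}
\]

These are explicit formulas for the fractions of the three genotypes in the nth generation of plants in terms of the initial genotype
fractions.

Because $\left(\tfrac{1}{2}\right)^n$ tends to zero as n approaches infinity, it follows from these equations that

\[
a_n \to 1, \qquad b_n \to 0, \qquad c_n = 0
\]

as n approaches infinity. That is, in the limit all plants in the population will be genotype AA.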
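
As a numerical check on Formulas (5), the sketch below iterates Equation 2 with exact fractions and compares each generation with the closed form. The initial distribution `x0` is a hypothetical choice made only for illustration, and the helper names `step` and `closed_form` are assumptions, not part of the text.

```python
from fractions import Fraction as F

# Transition matrix M of Example 1 (every plant is fertilized with genotype AA).
M = [[F(1), F(1, 2), F(0)],
     [F(0), F(1, 2), F(1)],
     [F(0), F(0),    F(0)]]


def step(M, x):
    """One generation: return the matrix-vector product M x."""
    return [sum(M[i][j] * x[j] for j in range(3)) for i in range(3)]


def closed_form(n, a0, b0, c0):
    """Formulas (5) for the nth generation, n >= 1."""
    half = F(1, 2)
    return [1 - half**n * b0 - half**(n - 1) * c0,
            half**n * b0 + half**(n - 1) * c0,
            F(0)]


x0 = [F(1, 4), F(1, 2), F(1, 4)]  # hypothetical initial fractions of AA, Aa, aa
x = x0
for n in range(1, 11):
    x = step(M, x)
    assert x == closed_form(n, *x0)  # iteration and closed form agree exactly

print([float(v) for v in x])  # after 10 generations the AA fraction is close to 1
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a floating-point tolerance check.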



EXAMPLE 2 Modifying Example 1 



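We can modify Example 1 so that instead of each plant being fertilized with one of genotype AA, each plant is fertilized with a
plant of its own genotype. Using the same notation as in Example 1, we then find

\[
\mathbf{x}^{(n)} = M^n\mathbf{x}^{(0)}
\]

where

\[
M = \begin{bmatrix} 1 & \tfrac{1}{4} & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & \tfrac{1}{4} & 1 \end{bmatrix}
\]

The columns of this new matrix M are the same as the columns of Table 1 corresponding to parents with genotypes AA-AA,
Aa-Aa, and aa-aa.

The eigenvalues of M are (verify)

\[
\lambda_1 = 1, \quad \lambda_2 = 1, \quad \lambda_3 = \tfrac{1}{2}
\]

The eigenvalue $\lambda_1 = 1$ has multiplicity two and its corresponding eigenspace is two-dimensional. Picking two linearly independent
eigenvectors $\mathbf{v}_1$ and $\mathbf{v}_2$ in that eigenspace, and a single eigenvector $\mathbf{v}_3$ for the simple eigenvalue $\lambda_3 = \tfrac{1}{2}$, we have (verify)

\[
\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad
\mathbf{v}_2 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad
\mathbf{v}_3 = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}
\]

The calculations for $\mathbf{x}^{(n)}$ are then

\[
\begin{aligned}
\mathbf{x}^{(n)} = M^n\mathbf{x}^{(0)} = PD^nP^{-1}\mathbf{x}^{(0)}
&= \begin{bmatrix} 1 & 0 & 1 \\ 0 & 0 & -2 \\ 0 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \left(\tfrac{1}{2}\right)^n \end{bmatrix}
\begin{bmatrix} 1 & \tfrac{1}{2} & 0 \\ 0 & \tfrac{1}{2} & 1 \\ 0 & -\tfrac{1}{2} & 0 \end{bmatrix}
\begin{bmatrix} a_0 \\ b_0 \\ c_0 \end{bmatrix} \\
&= \begin{bmatrix}
1 & \tfrac{1}{2} - \left(\tfrac{1}{2}\right)^{n+1} & 0 \\
0 & \left(\tfrac{1}{2}\right)^n & 0 \\
0 & \tfrac{1}{2} - \left(\tfrac{1}{2}\right)^{n+1} & 1
\end{bmatrix}
\begin{bmatrix} a_0 \\ b_0 \\ c_0 \end{bmatrix}
\end{aligned}
\]

Thus,

\[
\begin{aligned}
a_n &= a_0 + \left[\tfrac{1}{2} - \left(\tfrac{1}{2}\right)^{n+1}\right] b_0 \\
b_n &= \left(\tfrac{1}{2}\right)^n b_0 \qquad\qquad n = 1, 2, \ldots \\
c_n &= c_0 + \left[\tfrac{1}{2} - \left(\tfrac{1}{2}\right)^{n+1}\right] b_0
\end{aligned} \tag{6}
\]

In the limit, as n tends to infinity, $\left(\tfrac{1}{2}\right)^n \to 0$ and $\left(\tfrac{1}{2}\right)^{n+1} \to 0$, so

\[
a_n \to a_0 + \tfrac{1}{2}b_0, \qquad b_n \to 0, \qquad c_n \to c_0 + \tfrac{1}{2}b_0
\]

Thus, fertilization of each plant with one of its own genotype produces a population that in the limit contains only genotypes AA
and aa.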
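
The limiting behavior in Example 2 can also be observed by direct iteration. The sketch below is illustrative only: the initial distribution and the helper name `step` are assumptions. It applies the self-fertilization matrix repeatedly and compares the result with Formulas (6) and with the limits $a_0 + \tfrac{1}{2}b_0$ and $c_0 + \tfrac{1}{2}b_0$.

```python
from fractions import Fraction as F

# Transition matrix of Example 2 (each plant fertilized with its own genotype).
M = [[F(1), F(1, 4), F(0)],
     [F(0), F(1, 2), F(0)],
     [F(0), F(1, 4), F(1)]]


def step(M, x):
    """One generation: return the matrix-vector product M x."""
    return [sum(M[i][j] * x[j] for j in range(3)) for i in range(3)]


a0, b0, c0 = F(1, 5), F(3, 5), F(1, 5)  # hypothetical initial fractions
x = [a0, b0, c0]
n = 20
for _ in range(n):
    x = step(M, x)

half = F(1, 2)
formula_6 = [a0 + (half - half**(n + 1)) * b0,
             half**n * b0,
             c0 + (half - half**(n + 1)) * b0]
assert x == formula_6  # iteration matches Formulas (6) exactly

print([float(v) for v in x])             # Aa fraction is nearly zero
print(float(a0 + b0 / 2), float(c0 + b0 / 2))  # limiting AA and aa fractions
```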



Autosomal Recessive Diseases 

There are many genetic diseases governed by autosomal inheritance in which a normal gene A dominates an abnormal gene a. 
Genotype AA is a normal individual; genotype Aa is a carrier of the disease but is not afflicted with the disease; and genotype aa is
afflicted with the disease. In humans such genetic diseases are often associated with a particular racial group — for instance, cystic 
fibrosis (predominant among Caucasians), sickle-cell anemia (predominant among blacks), Cooley's anemia (predominant among 
people of Mediterranean origin), and Tay-Sachs disease (predominant among Eastern European Jews). 

Suppose that an animal breeder has a population of animals that carries an autosomal recessive disease. Suppose further that those 
animals afflicted with the disease do not survive to maturity. One possible way to control such a disease is for the breeder to 
always mate a female, regardless of her genotype, with a normal male. In this way, all future offspring will either have a normal 
father and a normal mother (AA — AA matings) or a normal father and a carrier mother (AA — Aa matings). There can be no 
AA — aa matings since animals of genotype a a do not survive to maturity. Under this type of mating program no future offspring 
will be afflicted with the disease, although there will still be carriers in future generations. Let us now determine the fraction of 
carriers in future generations. We set 



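\[ \mathbf{x}^{(n)} = \begin{bmatrix} a_n \\ b_n \end{bmatrix}, \qquad n = 1, 2, \ldots \]

where

$a_n$ = fraction of population of genotype AA in nth generation

$b_n$ = fraction of population of genotype Aa (carriers) in nth generation

Because each offspring has at least one normal parent, we may consider the controlled mating program as one of continual mating
with genotype AA, as in Example 1. Thus, the transition of genotype distributions from one generation to the next is governed by
the equation

\[ \mathbf{x}^{(n)} = M \mathbf{x}^{(n-1)}, \qquad n = 1, 2, \ldots \]

where

\[ M = \begin{bmatrix} 1 & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} \end{bmatrix} \]

Because we know the initial distribution $\mathbf{x}^{(0)}$, the distribution of genotypes in the nth generation is thus given by

\[ \mathbf{x}^{(n)} = M^n \mathbf{x}^{(0)}, \qquad n = 1, 2, \ldots \]

The diagonalization of M is easily carried out (see Exercise 4) and leads to

\[
\mathbf{x}^{(n)} = P D^n P^{-1} \mathbf{x}^{(0)}
= \begin{bmatrix} 1 & 1 \\ 0 & -1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & \left(\tfrac{1}{2}\right)^n \end{bmatrix}
\begin{bmatrix} 1 & 1 \\ 0 & -1 \end{bmatrix}
\begin{bmatrix} a_0 \\ b_0 \end{bmatrix}
= \begin{bmatrix} 1 & 1 - \left(\tfrac{1}{2}\right)^n \\ 0 & \left(\tfrac{1}{2}\right)^n \end{bmatrix}
\begin{bmatrix} a_0 \\ b_0 \end{bmatrix}
= \begin{bmatrix} a_0 + b_0 - \left(\tfrac{1}{2}\right)^n b_0 \\ \left(\tfrac{1}{2}\right)^n b_0 \end{bmatrix}
\]

Because $a_0 + b_0 = 1$, we have

\[
\begin{aligned}
a_n &= 1 - \left(\tfrac{1}{2}\right)^n b_0 \\
b_n &= \left(\tfrac{1}{2}\right)^n b_0
\end{aligned}
\qquad n = 1, 2, \ldots
\tag{7}
\]

Thus, as n tends to infinity, we have

\[ a_n \to 1, \qquad b_n \to 0 \]

so in the limit there will be no carriers in the population.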
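As a quick numerical check of Equation 7, the following Python sketch iterates the controlled-mating transition directly and compares each generation with the closed form. It assumes the transition matrix has rows (1, 1/2) and (0, 1/2), which is the matrix implied by Equation 7; the function name `carrier_fraction` and the starting value of 10% carriers are illustrative choices, not part of the text.

```python
from fractions import Fraction

def carrier_fraction(b0, n):
    """Apply x(n) = M x(n-1) n times with M = [[1, 1/2], [0, 1/2]].

    Returns (a_n, b_n), starting from a_0 = 1 - b_0 (no afflicted animals).
    """
    a, b = 1 - b0, b0
    half = Fraction(1, 2)
    for _ in range(n):
        # First row of M: a_n = a_{n-1} + (1/2) b_{n-1}
        # Second row of M: b_n = (1/2) b_{n-1}
        a, b = a + half * b, half * b
    return a, b

b0 = Fraction(1, 10)  # assumed initial carrier fraction (illustrative)
for n in range(1, 6):
    a_n, b_n = carrier_fraction(b0, n)
    # Closed form from Equation 7
    assert b_n == Fraction(1, 2) ** n * b0
    assert a_n == 1 - Fraction(1, 2) ** n * b0
    print(n, float(a_n), float(b_n))
```

Exact rational arithmetic is used so that the comparison with Equation 7 is an equality rather than a floating-point approximation.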



From 7 we see that

\[ b_n = \tfrac{1}{2} b_{n-1}, \qquad n = 1, 2, \ldots \tag{8} \]

That is, the fraction of carriers in each generation is one-half the fraction of carriers in the preceding generation. It would be of
interest also to investigate the propagation of carriers under random mating, when two animals mate without regard to their
genotypes. Unfortunately, such random mating leads to nonlinear equations, and the techniques of this section are not applicable.
However, by other techniques it can be shown that under random mating, Equation 8 is replaced by

\[ b_n = \frac{b_{n-1}}{1 + \tfrac{1}{2} b_{n-1}}, \qquad n = 1, 2, \ldots \tag{9} \]



As a numerical example, suppose that the breeder starts with a population in which 10% of the animals are carriers. Under the 
controlled-mating program governed by Equation 8, the percentage of carriers can be reduced to 5% in one generation. But under 
random mating, Equation 9 predicts that 9.5% of the population will be carriers after one generation ($b_n = 0.095$ if $b_{n-1} = 0.10$). In
addition, under controlled mating no offspring will ever be afflicted with the disease, but with random mating it can be shown that
about 1 in 400 offspring will be born with the disease when 10% of the population are carriers. 
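The comparison between the two mating programs can be tabulated with a short Python sketch. It assumes Equation 8 reads $b_n = \tfrac{1}{2} b_{n-1}$ and Equation 9 reads $b_n = b_{n-1} / (1 + \tfrac{1}{2} b_{n-1})$, which is the form consistent with the 9.5% figure quoted above; the function names are illustrative.

```python
def controlled(b_prev):
    """Equation 8: carriers halve each generation under controlled mating."""
    return 0.5 * b_prev

def random_mating(b_prev):
    """Equation 9 (as read here): carrier fraction under random mating."""
    return b_prev / (1 + 0.5 * b_prev)

b_c = b_r = 0.10  # 10% carriers initially, as in the numerical example
for n in range(1, 6):
    b_c, b_r = controlled(b_c), random_mating(b_r)
    print(f"generation {n}: controlled {b_c:.4f}, random {b_r:.4f}")
```

Under these assumptions the controlled program decays geometrically, while the random-mating recurrence shrinks the carrier fraction only slowly.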



X-Linked Inheritance 



As mentioned in the introduction, in X-linked inheritance the male possesses one gene (A or a) and the female possesses two genes 
(AA, Aa, or aa). The term X-linked is used because such genes are found on the X-chromosome, of which the male has one and the
female has two. The inheritance of such genes is as follows: A male offspring receives one of his mother's two genes with equal 
probability, and a female offspring receives the one gene of her father and one of her mother's two genes with equal probability. 
Readers familiar with basic probability can verify that this type of inheritance leads to the genotype probabilities in Table 2. 

Table 2

                         Genotypes of Parents (Father, Mother)
Offspring         (A, AA)  (A, Aa)  (A, aa)  (a, AA)  (a, Aa)  (a, aa)
Male     A           1       1/2       0        1       1/2       0
         a           0       1/2       1        0       1/2       1
Female   AA          1       1/2       0        0        0        0
         Aa          0       1/2       1        1       1/2       0
         aa          0        0        0        0       1/2       1



We will discuss a program of inbreeding in connection with X-linked inheritance. We begin with a male and female; select two of 
their offspring at random, one of each gender, and mate them; select two of the resulting offspring and mate them; and so forth. 
Such inbreeding is commonly performed with animals. (Among humans, such brother-sister marriages were used by the rulers of
ancient Egypt to keep the royal line pure.) 

The original male-female pair can be one of the six types, corresponding to the six columns of Table 2: 

(A, AA), (A, Aa), (A, aa), (a, AA), (a, Aa), (a, aa)

The sibling pairs mated in each successive generation have certain probabilities of being one of these six types. To compute these 
probabilities, for $n = 0, 1, 2, \ldots$, let us set

$a_n$ = probability sibling-pair mated in nth generation is type (A, AA)
$b_n$ = probability sibling-pair mated in nth generation is type (A, Aa)
$c_n$ = probability sibling-pair mated in nth generation is type (A, aa)
$d_n$ = probability sibling-pair mated in nth generation is type (a, AA)
$e_n$ = probability sibling-pair mated in nth generation is type (a, Aa)
$f_n$ = probability sibling-pair mated in nth generation is type (a, aa)

With these probabilities we form a column vector

\[ \mathbf{x}^{(n)} = \begin{bmatrix} a_n \\ b_n \\ c_n \\ d_n \\ e_n \\ f_n \end{bmatrix}, \qquad n = 0, 1, 2, \ldots \]

From Table 2 it follows that

\[ \mathbf{x}^{(n)} = M \mathbf{x}^{(n-1)}, \qquad n = 1, 2, \ldots \tag{10} \]



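where

\[
M = \begin{bmatrix}
1 & \tfrac{1}{4} & 0 & 0 & 0 & 0 \\
0 & \tfrac{1}{4} & 0 & 1 & \tfrac{1}{4} & 0 \\
0 & 0 & 0 & 0 & \tfrac{1}{4} & 0 \\
0 & \tfrac{1}{4} & 0 & 0 & 0 & 0 \\
0 & \tfrac{1}{4} & 1 & 0 & \tfrac{1}{4} & 0 \\
0 & 0 & 0 & 0 & \tfrac{1}{4} & 1
\end{bmatrix}
\]

with the rows and the columns both labeled, in order, (A, AA), (A, Aa), (A, aa), (a, AA), (a, Aa), (a, aa).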



For example, suppose that in the (n - 1)-st generation, the sibling pair mated is type (A, Aa). Then their male offspring will be
genotype A or a with equal probability, and their female offspring will be genotype AA or Aa with equal probability. Because one
of the male offspring and one of the female offspring are chosen at random for mating, the next sibling pair will be one of type
(A, AA), (A, Aa), (a, AA), or (a, Aa) with equal probability. Thus, the second column of M contains 1/4 in each of the four rows
corresponding to these four sibling pairs. (See Exercise 9 for the remaining columns.)
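The column-by-column reasoning above can be mechanized. The Python sketch below builds all six columns of M from the inheritance rules stated for X-linked genes (a son receives one of the mother's two genes; a daughter receives the father's gene plus one of the mother's two genes, each with equal probability). The names `PAIRS` and `offspring_pairs` are illustrative, and the genotype strings are simply a convenient encoding.

```python
from fractions import Fraction
from itertools import product

# Sibling-pair types (male genotype, female genotype), in the order of Table 2.
PAIRS = [("A", "AA"), ("A", "Aa"), ("A", "aa"),
         ("a", "AA"), ("a", "Aa"), ("a", "aa")]

def offspring_pairs(father, mother):
    """Distribution of (son, daughter) genotypes for one parental pair."""
    dist = {}
    # The son's gene and the daughter's maternal gene are drawn independently
    # from the mother's two genes, so each combination has probability 1/4.
    for son_gene, maternal_gene in product(mother, repeat=2):
        genes = father + maternal_gene
        daughter = "Aa" if set(genes) == {"A", "a"} else genes
        key = (son_gene, daughter)
        dist[key] = dist.get(key, Fraction(0)) + Fraction(1, 4)
    return dist

# Column j of M is the distribution of next-generation pairs from parental pair j.
M = [[offspring_pairs(*parents).get(child, Fraction(0)) for parents in PAIRS]
     for child in PAIRS]

for row in M:
    print([str(entry) for entry in row])
```

Each column of the resulting matrix sums to 1, as it must for a matrix of transition probabilities.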



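As in our previous examples, it follows from 10 that

\[ \mathbf{x}^{(n)} = M^n \mathbf{x}^{(0)}, \qquad n = 1, 2, \ldots \tag{11} \]

After lengthy calculations, the eigenvalues and eigenvectors of M turn out to be

\[
\lambda_1 = 1, \quad \lambda_2 = 1, \quad \lambda_3 = \tfrac{1}{2}, \quad \lambda_4 = -\tfrac{1}{2}, \quad
\lambda_5 = \tfrac{1}{4}(1 + \sqrt{5}), \quad \lambda_6 = \tfrac{1}{4}(1 - \sqrt{5})
\]

\[
\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad
\mathbf{v}_2 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}, \quad
\mathbf{v}_3 = \begin{bmatrix} -1 \\ 2 \\ -1 \\ 1 \\ -2 \\ 1 \end{bmatrix}, \quad
\mathbf{v}_4 = \begin{bmatrix} 1 \\ -6 \\ -3 \\ 3 \\ 6 \\ -1 \end{bmatrix}, \quad
\mathbf{v}_5 = \begin{bmatrix} \tfrac{1}{4}(-3 - \sqrt{5}) \\ 1 \\ \tfrac{1}{4}(-1 + \sqrt{5}) \\ \tfrac{1}{4}(-1 + \sqrt{5}) \\ 1 \\ \tfrac{1}{4}(-3 - \sqrt{5}) \end{bmatrix}, \quad
\mathbf{v}_6 = \begin{bmatrix} \tfrac{1}{4}(-3 + \sqrt{5}) \\ 1 \\ \tfrac{1}{4}(-1 - \sqrt{5}) \\ \tfrac{1}{4}(-1 - \sqrt{5}) \\ 1 \\ \tfrac{1}{4}(-3 + \sqrt{5}) \end{bmatrix}
\]

The diagonalization of M then leads to

\[ \mathbf{x}^{(n)} = P D^n P^{-1} \mathbf{x}^{(0)}, \qquad n = 1, 2, \ldots \tag{12} \]

where

\[
P = \begin{bmatrix}
1 & 0 & -1 & 1 & \tfrac{1}{4}(-3 - \sqrt{5}) & \tfrac{1}{4}(-3 + \sqrt{5}) \\
0 & 0 & 2 & -6 & 1 & 1 \\
0 & 0 & -1 & -3 & \tfrac{1}{4}(-1 + \sqrt{5}) & \tfrac{1}{4}(-1 - \sqrt{5}) \\
0 & 0 & 1 & 3 & \tfrac{1}{4}(-1 + \sqrt{5}) & \tfrac{1}{4}(-1 - \sqrt{5}) \\
0 & 0 & -2 & 6 & 1 & 1 \\
0 & 1 & 1 & -1 & \tfrac{1}{4}(-3 - \sqrt{5}) & \tfrac{1}{4}(-3 + \sqrt{5})
\end{bmatrix}
\]

\[
D^n = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & \left(\tfrac{1}{2}\right)^n & 0 & 0 & 0 \\
0 & 0 & 0 & \left(-\tfrac{1}{2}\right)^n & 0 & 0 \\
0 & 0 & 0 & 0 & \left[\tfrac{1}{4}(1 + \sqrt{5})\right]^n & 0 \\
0 & 0 & 0 & 0 & 0 & \left[\tfrac{1}{4}(1 - \sqrt{5})\right]^n
\end{bmatrix}
\]

\[
P^{-1} = \begin{bmatrix}
1 & \tfrac{2}{3} & \tfrac{1}{3} & \tfrac{2}{3} & \tfrac{1}{3} & 0 \\
0 & \tfrac{1}{3} & \tfrac{2}{3} & \tfrac{1}{3} & \tfrac{2}{3} & 1 \\
0 & \tfrac{1}{8} & -\tfrac{1}{4} & \tfrac{1}{4} & -\tfrac{1}{8} & 0 \\
0 & -\tfrac{1}{24} & -\tfrac{1}{12} & \tfrac{1}{12} & \tfrac{1}{24} & 0 \\
0 & \tfrac{1}{20}(5 + \sqrt{5}) & \tfrac{1}{5}\sqrt{5} & \tfrac{1}{5}\sqrt{5} & \tfrac{1}{20}(5 + \sqrt{5}) & 0 \\
0 & \tfrac{1}{20}(5 - \sqrt{5}) & -\tfrac{1}{5}\sqrt{5} & -\tfrac{1}{5}\sqrt{5} & \tfrac{1}{20}(5 - \sqrt{5}) & 0
\end{bmatrix}
\]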

We will not write out the matrix product in 12, as it is rather unwieldy. However, if a specific vector $\mathbf{x}^{(0)}$ is given, the calculation
for $\mathbf{x}^{(n)}$ is not too cumbersome (see Exercise 6).

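Because the absolute values of the last four diagonal entries of D are less than 1, we see that as n tends to infinity,

\[
D^n \to \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]

And so, from Equation 12,

\[
\mathbf{x}^{(n)} \to P \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix} P^{-1} \mathbf{x}^{(0)}
\]

Performing the matrix multiplication on the right, we obtain (verify)

\[
\mathbf{x}^{(n)} \to \begin{bmatrix}
a_0 + \tfrac{2}{3} b_0 + \tfrac{1}{3} c_0 + \tfrac{2}{3} d_0 + \tfrac{1}{3} e_0 \\
0 \\ 0 \\ 0 \\ 0 \\
f_0 + \tfrac{1}{3} b_0 + \tfrac{2}{3} c_0 + \tfrac{1}{3} d_0 + \tfrac{2}{3} e_0
\end{bmatrix}
\tag{13}
\]

That is, in the limit all sibling pairs will be either type (A, AA) or type (a, aa). For example, if the initial parents are type (A, Aa)
(that is, $b_0 = 1$ and $a_0 = c_0 = d_0 = e_0 = f_0 = 0$), then as n tends to infinity,

\[
\mathbf{x}^{(n)} \to \begin{bmatrix} \tfrac{2}{3} \\ 0 \\ 0 \\ 0 \\ 0 \\ \tfrac{1}{3} \end{bmatrix}
\]

Thus, in the limit there is probability 2/3 that the sibling pairs will be (A, AA), and probability 1/3 that they will be (a, aa).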
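The limiting values can be checked without diagonalizing, by simply applying the transition of Equation 10 many times. The sketch below assumes the matrix M whose second and fifth columns hold 1/4 in four rows each, as derived from Table 2; the helper name `step` and the choice of 60 generations are illustrative.

```python
from fractions import Fraction

q = Fraction(1, 4)
# Transition matrix of Equation 10; rows and columns ordered
# (A,AA), (A,Aa), (A,aa), (a,AA), (a,Aa), (a,aa).
M = [[1, q, 0, 0, 0, 0],
     [0, q, 0, 1, q, 0],
     [0, 0, 0, 0, q, 0],
     [0, q, 0, 0, 0, 0],
     [0, q, 1, 0, q, 0],
     [0, 0, 0, 0, q, 1]]

def step(M, x):
    """One generation of inbreeding: return the product M x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

x = [0, 1, 0, 0, 0, 0]  # initial parents of type (A, Aa)
for _ in range(60):
    x = step(M, x)

print([round(float(v), 6) for v in x])
```

After enough generations the first and last entries should settle near 2/3 and 1/3, with the middle four entries near zero, in line with Equation 13.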






Exercise Set 11.17

1. Show that if $M = PDP^{-1}$, then $M^n = PD^nP^{-1}$ for $n = 1, 2, \ldots$.

2. In Example 1 suppose that the plants are always fertilized with a plant of genotype Aa rather than one of genotype AA. Derive
formulas for the fractions of the plants of genotypes AA, Aa, and aa in the nth generation. Also, find the limiting genotype
distribution as n tends to infinity.

3. In Example 1 suppose that the initial plants are fertilized with genotype AA, the first generation is fertilized with genotype Aa,
the second generation is fertilized with genotype AA, and this alternating pattern of fertilization is kept up. Find formulas for the
fractions of the plants of genotypes AA, Aa, and aa in the nth generation.

4. In the section on autosomal recessive diseases, find the eigenvalues and eigenvectors of the matrix M and verify Equation 7.

5. Suppose that a breeder has an animal population in which 25% of the population are carriers of an autosomal recessive disease.
If the breeder allows the animals to mate irrespective of their genotype, use Equation 9 to calculate the number of generations
required for the percentage of carriers to fall from 25% to 10%. If the breeder instead implements the controlled-mating
program determined by Equation 8, what will the percentage of carriers be after the same number of generations?

6. In the section on X-linked inheritance, suppose that the initial parents are equally likely to be of any of the six possible
genotype parents; that is,

\[ \mathbf{x}^{(0)} = \begin{bmatrix} \tfrac{1}{6} \\ \tfrac{1}{6} \\ \tfrac{1}{6} \\ \tfrac{1}{6} \\ \tfrac{1}{6} \\ \tfrac{1}{6} \end{bmatrix} \]

Using Equation 12, calculate $\mathbf{x}^{(n)}$ and also calculate the limit of $\mathbf{x}^{(n)}$ as n tends to infinity.

7. From 13 show that under X-linked inheritance with inbreeding, the probability that the limiting sibling pairs will be of type
(A, AA) is the same as the proportion of A genes in the initial population.

8. In X-linked inheritance suppose that none of the females of genotype Aa survive to maturity. Under inbreeding the possible
sibling pairs are then

(A, AA), (A, aa), (a, AA), and (a, aa)

Find the transition matrix that describes how the genotype distribution changes in one generation.

9. Derive the matrix M in Equation 10 from Table 2.

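Section 11.17 Technology Exercises

The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets.

T1.

(a) Use a computer to verify that the eigenvalues and eigenvectors of

\[
M = \begin{bmatrix}
1 & \tfrac{1}{4} & 0 & 0 & 0 & 0 \\
0 & \tfrac{1}{4} & 0 & 1 & \tfrac{1}{4} & 0 \\
0 & 0 & 0 & 0 & \tfrac{1}{4} & 0 \\
0 & \tfrac{1}{4} & 0 & 0 & 0 & 0 \\
0 & \tfrac{1}{4} & 1 & 0 & \tfrac{1}{4} & 0 \\
0 & 0 & 0 & 0 & \tfrac{1}{4} & 1
\end{bmatrix}
\]

as given in the text are correct.

(b) Starting with $\mathbf{x}^{(n)} = M\mathbf{x}^{(n-1)}$ and the assumption that

\[ \lim_{n \to \infty} \mathbf{x}^{(n)} = \mathbf{x} \]

exists, we must have

\[ \lim_{n \to \infty} \mathbf{x}^{(n)} = M \lim_{n \to \infty} \mathbf{x}^{(n-1)} \]

or

\[ \mathbf{x} = M\mathbf{x} \]

This suggests that $\mathbf{x}$ can be solved directly using the equation $(M - I)\mathbf{x} = \mathbf{0}$. Use a computer to
solve the equation $\mathbf{x} = M\mathbf{x}$, where

\[ \mathbf{x} = \begin{bmatrix} a \\ b \\ c \\ d \\ e \\ f \end{bmatrix} \]

and $a + b + c + d + e + f = 1$; compare your results to Equation 13. Explain why the solution to
$(M - I)\mathbf{x} = \mathbf{0}$ along with $a + b + c + d + e + f = 1$ is not specific enough to determine $\lim_{n \to \infty} \mathbf{x}^{(n)}$.

T2.

(a) Given

\[
P = \begin{bmatrix}
1 & 0 & -1 & 1 & \tfrac{1}{4}(-3 - \sqrt{5}) & \tfrac{1}{4}(-3 + \sqrt{5}) \\
0 & 0 & 2 & -6 & 1 & 1 \\
0 & 0 & -1 & -3 & \tfrac{1}{4}(-1 + \sqrt{5}) & \tfrac{1}{4}(-1 - \sqrt{5}) \\
0 & 0 & 1 & 3 & \tfrac{1}{4}(-1 + \sqrt{5}) & \tfrac{1}{4}(-1 - \sqrt{5}) \\
0 & 0 & -2 & 6 & 1 & 1 \\
0 & 1 & 1 & -1 & \tfrac{1}{4}(-3 - \sqrt{5}) & \tfrac{1}{4}(-3 + \sqrt{5})
\end{bmatrix}
\]

from Equation 12 and

\[
\lim_{n \to \infty} D^n = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]

use a computer to show that

\[
\lim_{n \to \infty} M^n = \begin{bmatrix}
1 & \tfrac{2}{3} & \tfrac{1}{3} & \tfrac{2}{3} & \tfrac{1}{3} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & \tfrac{1}{3} & \tfrac{2}{3} & \tfrac{1}{3} & \tfrac{2}{3} & 1
\end{bmatrix}
\]

(b) Use a computer to calculate $M^n$ for n = 10, 20, 30, 40, 50, 60, 70, and then compare your results to the limit in part
(a).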
11.18

AGE-SPECIFIC 
POPULATION GROWTH 



In this section we investigate, using the Leslie matrix model, the growth over 
time of a female population that is divided into age classes. We then determine 
the limiting age distribution and growth rate of the population. 



Prerequisites: Eigenvalues and Eigenvectors 
Diagonalization of a Matrix 

Intuitive Understanding of Limits 

One of the most common models of population growth used by demographers is the so-called Leslie model developed in the 
1940s. This model describes the growth of the female portion of a human or animal population. In this model the females are 
divided into age classes of equal duration. To be specific, suppose that the maximum age attained by any female in the population 
is L years (or some other time unit) and we divide the population into n age classes. Then each class is L/n years in duration. We
label the age classes according to Table 1. 



Table 1

Age Class     Age Interval
1             [0, L/n)
2             [L/n, 2L/n)
3             [2L/n, 3L/n)
...           ...
n - 1         [(n - 2)L/n, (n - 1)L/n)
n             [(n - 1)L/n, L]



Suppose that we know the number of females in each of the n classes at time t = 0. In particular, let there be $x_1^{(0)}$ females in the
first class, $x_2^{(0)}$ females in the second class, and so forth. With these n numbers we form a column vector:

\[ \mathbf{x}^{(0)} = \begin{bmatrix} x_1^{(0)} \\ x_2^{(0)} \\ \vdots \\ x_n^{(0)} \end{bmatrix} \]

We call this vector the initial age distribution vector. 

As time progresses, the number of females within each of the n classes changes because of three biological processes: birth, death, 
and aging. By describing these three processes quantitatively, we will see how to project the initial age distribution vector into the 
future. 



The easiest way to study the aging process is to observe the population at discrete times, say, $t_0, t_1, t_2, \ldots, t_k, \ldots$. The Leslie
model requires that the duration between any two successive observation times be the same as the duration of the age intervals.
Therefore, we set

\[
\begin{aligned}
t_0 &= 0 \\
t_1 &= L/n \\
t_2 &= 2L/n \\
&\ \vdots \\
t_k &= kL/n \\
&\ \vdots
\end{aligned}
\]

With this assumption, all females in the (i + 1)-st class at time $t_{k+1}$ were in the ith class at time $t_k$.

The birth and death processes between two successive observation times can be described by means of the following demographic 
parameters: 

$a_i$ ($i = 1, 2, \ldots, n$): The average number of daughters born to each female during the time she is in the ith age class

$b_i$ ($i = 1, 2, \ldots, n - 1$): The fraction of females in the ith age class that can be expected to survive and pass into the (i + 1)-st age class

By their definitions, we have that

(i) $a_i \ge 0$ for $i = 1, 2, \ldots, n$

(ii) $0 < b_i \le 1$ for $i = 1, 2, \ldots, n - 1$

Note that we do not allow any $b_i$ to equal zero, because then no females would survive beyond the ith age class. We also assume
that at least one $a_i$ is positive so that some births occur. Any age class for which the corresponding value of $a_i$ is positive is called
a fertile age class.



We next define the age distribution vector $\mathbf{x}^{(k)}$ at time $t_k$ by

\[ \mathbf{x}^{(k)} = \begin{bmatrix} x_1^{(k)} \\ x_2^{(k)} \\ \vdots \\ x_n^{(k)} \end{bmatrix} \]

where $x_i^{(k)}$ is the number of females in the ith age class at time $t_k$. Now, at time $t_k$, the females in the first age class are just those
daughters born between times $t_{k-1}$ and $t_k$. Thus, we can write

{number of females in class 1 at time $t_k$}
   = {number of daughters born to females in class 1 between times $t_{k-1}$ and $t_k$}
   + {number of daughters born to females in class 2 between times $t_{k-1}$ and $t_k$}
   + ...
   + {number of daughters born to females in class n between times $t_{k-1}$ and $t_k$}

or, mathematically,

\[ x_1^{(k)} = a_1 x_1^{(k-1)} + a_2 x_2^{(k-1)} + \cdots + a_n x_n^{(k-1)} \tag{1} \]



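The females in the (i + 1)-st age class ($i = 1, 2, \ldots, n - 1$) at time $t_k$ are those females in the ith class at time $t_{k-1}$ who are still
alive at time $t_k$. Thus,

{number of females in class i + 1 at time $t_k$}
   = {fraction of females in class i who survive and pass into class i + 1} x {number of females in class i at time $t_{k-1}$}

or, mathematically,

\[ x_{i+1}^{(k)} = b_i x_i^{(k-1)}, \qquad i = 1, 2, \ldots, n - 1 \tag{2} \]

Using matrix notation, we can write Equations 1 and 2 as

\[
\begin{bmatrix} x_1^{(k)} \\ x_2^{(k)} \\ x_3^{(k)} \\ \vdots \\ x_n^{(k)} \end{bmatrix}
= \begin{bmatrix}
a_1 & a_2 & a_3 & \cdots & a_{n-1} & a_n \\
b_1 & 0 & 0 & \cdots & 0 & 0 \\
0 & b_2 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & b_{n-1} & 0
\end{bmatrix}
\begin{bmatrix} x_1^{(k-1)} \\ x_2^{(k-1)} \\ x_3^{(k-1)} \\ \vdots \\ x_n^{(k-1)} \end{bmatrix}
\]

or, more compactly,

\[ \mathbf{x}^{(k)} = L \mathbf{x}^{(k-1)}, \qquad k = 1, 2, \ldots \tag{3} \]

where L is the Leslie matrix

\[
L = \begin{bmatrix}
a_1 & a_2 & a_3 & \cdots & a_{n-1} & a_n \\
b_1 & 0 & 0 & \cdots & 0 & 0 \\
0 & b_2 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & b_{n-1} & 0
\end{bmatrix}
\tag{4}
\]

From Equation 3 it follows that

\[
\begin{aligned}
\mathbf{x}^{(1)} &= L\mathbf{x}^{(0)} \\
\mathbf{x}^{(2)} &= L\mathbf{x}^{(1)} = L^2\mathbf{x}^{(0)} \\
\mathbf{x}^{(3)} &= L\mathbf{x}^{(2)} = L^3\mathbf{x}^{(0)} \\
&\ \vdots \\
\mathbf{x}^{(k)} &= L\mathbf{x}^{(k-1)} = L^k\mathbf{x}^{(0)}
\end{aligned}
\tag{5}
\]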



Thus, if we know the initial age distribution $\mathbf{x}^{(0)}$ and the Leslie matrix L, we can determine the female age distribution at any later
time.



EXAMPLE 1 Female Age Distribution for Animals 



Suppose that the oldest age attained by the females in a certain animal population is 15 years and we divide the population into 
three age classes with equal durations of five years. Let the Leslie matrix for this population be 



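\[ L = \begin{bmatrix} 0 & 4 & 3 \\ \tfrac{1}{2} & 0 & 0 \\ 0 & \tfrac{1}{4} & 0 \end{bmatrix} \]

If there are initially 1000 females in each of the three age classes, then from Equation 3 we have

\[
\mathbf{x}^{(0)} = \begin{bmatrix} 1000 \\ 1000 \\ 1000 \end{bmatrix}
\]

\[
\mathbf{x}^{(1)} = L\mathbf{x}^{(0)} =
\begin{bmatrix} 0 & 4 & 3 \\ \tfrac{1}{2} & 0 & 0 \\ 0 & \tfrac{1}{4} & 0 \end{bmatrix}
\begin{bmatrix} 1000 \\ 1000 \\ 1000 \end{bmatrix}
= \begin{bmatrix} 7000 \\ 500 \\ 250 \end{bmatrix}
\]

\[
\mathbf{x}^{(2)} = L\mathbf{x}^{(1)} =
\begin{bmatrix} 0 & 4 & 3 \\ \tfrac{1}{2} & 0 & 0 \\ 0 & \tfrac{1}{4} & 0 \end{bmatrix}
\begin{bmatrix} 7000 \\ 500 \\ 250 \end{bmatrix}
= \begin{bmatrix} 2750 \\ 3500 \\ 125 \end{bmatrix}
\]

\[
\mathbf{x}^{(3)} = L\mathbf{x}^{(2)} =
\begin{bmatrix} 0 & 4 & 3 \\ \tfrac{1}{2} & 0 & 0 \\ 0 & \tfrac{1}{4} & 0 \end{bmatrix}
\begin{bmatrix} 2750 \\ 3500 \\ 125 \end{bmatrix}
= \begin{bmatrix} 14{,}375 \\ 1375 \\ 875 \end{bmatrix}
\]

Thus, after 15 years there are 14,375 females between 0 and 5 years of age, 1375 females between 5 and 10 years of age, and 875
females between 10 and 15 years of age.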
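The projection in this example can be reproduced with a few lines of Python. The sketch below applies Equations 1 and 2 directly rather than forming the matrix L; the function name `leslie_step` and the list `history` are illustrative names, not notation from the text.

```python
from fractions import Fraction

def leslie_step(a, b, x):
    """One projection step x(k) = L x(k-1), written via Equations 1 and 2.

    a: birth parameters a_1..a_n (first row of L)
    b: survival parameters b_1..b_{n-1} (subdiagonal of L)
    """
    newborns = sum(ai * xi for ai, xi in zip(a, x))   # Equation 1
    survivors = [bi * xi for bi, xi in zip(b, x)]     # Equation 2
    return [newborns] + survivors

a = [0, 4, 3]
b = [Fraction(1, 2), Fraction(1, 4)]
x = [1000, 1000, 1000]
history = [x]
for k in range(3):
    x = leslie_step(a, b, x)
    history.append(x)
    print(k + 1, [int(v) for v in x])
```

The three printed vectors correspond to the age distributions after 5, 10, and 15 years.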



Limiting Behavior 

Although Equation 5 gives the age distribution of the population at any time, it does not immediately give a general picture of the 
dynamics of the growth process. For this we need to investigate the eigenvalues and eigenvectors of the Leslie matrix. The 
eigenvalues of L are the roots of its characteristic polynomial. As we ask the reader to verify in Exercise 2, this characteristic 
polynomial is 

\[
p(\lambda) = |\lambda I - L|
= \lambda^n - a_1\lambda^{n-1} - a_2 b_1 \lambda^{n-2} - a_3 b_1 b_2 \lambda^{n-3} - \cdots - a_n b_1 b_2 \cdots b_{n-1}
\]

To analyze the roots of this polynomial, it will be convenient to introduce the function

\[
q(\lambda) = \frac{a_1}{\lambda} + \frac{a_2 b_1}{\lambda^2} + \frac{a_3 b_1 b_2}{\lambda^3} + \cdots + \frac{a_n b_1 b_2 \cdots b_{n-1}}{\lambda^n}
\tag{6}
\]

Using this function, the characteristic equation $p(\lambda) = 0$ can be written (verify)

\[ q(\lambda) = 1 \quad \text{for } \lambda \ne 0 \tag{7} \]

Because all the $a_i$ and $b_i$ are nonnegative, we see that $q(\lambda)$ is monotonically decreasing for $\lambda$ greater than zero. Furthermore, $q(\lambda)$
has a vertical asymptote at $\lambda = 0$ and approaches zero as $\lambda \to \infty$. Consequently, as Figure 11.18.1 indicates, there is a unique $\lambda$,
say $\lambda = \lambda_1$, such that $q(\lambda_1) = 1$. That is, the matrix L has a unique positive eigenvalue. It can also be shown (see Exercise 3) that
$\lambda_1$ has multiplicity 1; that is, $\lambda_1$ is not a repeated root of the characteristic equation. Although we omit the computational details,
the reader can verify that an eigenvector corresponding to $\lambda_1$ is

\[
\mathbf{x}_1 = \begin{bmatrix}
1 \\ b_1/\lambda_1 \\ b_1 b_2/\lambda_1^2 \\ b_1 b_2 b_3/\lambda_1^3 \\ \vdots \\ b_1 b_2 \cdots b_{n-1}/\lambda_1^{n-1}
\end{bmatrix}
\tag{8}
\]

Because $\lambda_1$ has multiplicity 1, its corresponding eigenspace has dimension one (Exercise 3), and so any eigenvector corresponding
to it is some multiple of $\mathbf{x}_1$. We can summarize these results in the following theorem.




THEOREM 11.18.1 



Existence of a Positive Eigenvalue 

A Leslie matrix L has a unique positive eigenvalue $\lambda_1$. This eigenvalue has multiplicity 1 and an eigenvector $\mathbf{x}_1$ all of whose
entries are positive.
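Because $q(\lambda)$ is decreasing for $\lambda > 0$, the positive eigenvalue can be located numerically by bisection on $q(\lambda) = 1$. The Python sketch below is one way to do this, assuming Equation 6 has the form $q(\lambda) = a_1/\lambda + a_2 b_1/\lambda^2 + \cdots + a_n b_1 \cdots b_{n-1}/\lambda^n$ and Equation 8 has entries $1, b_1/\lambda_1, b_1 b_2/\lambda_1^2, \ldots$; the function names and the tolerance are illustrative, and the test data are the parameters of Example 1.

```python
def q(lam, a, b):
    """q(lambda) from Equation 6 for birth parameters a and survival parameters b."""
    total, survival = 0.0, 1.0
    for i, ai in enumerate(a):
        total += ai * survival / lam ** (i + 1)
        if i < len(b):
            survival *= b[i]          # running product b_1 b_2 ... b_i
    return total

def positive_eigenvalue(a, b, tol=1e-12):
    """Bisection on q(lambda) = 1, relying on q being decreasing for lambda > 0."""
    lo, hi = 1e-9, 1.0
    while q(hi, a, b) > 1:            # expand until the root is bracketed
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if q(mid, a, b) > 1:
            lo = mid                  # root lies to the right of mid
        else:
            hi = mid
    return (lo + hi) / 2

def eigenvector(lam, b):
    """Equation 8: entries 1, b1/lam, b1*b2/lam**2, ..."""
    x, entry = [1.0], 1.0
    for bi in b:
        entry *= bi / lam
        x.append(entry)
    return x

a, b = [0, 4, 3], [0.5, 0.25]         # parameters of Example 1
lam1 = positive_eigenvalue(a, b)
print(lam1, eigenvector(lam1, b))
```

Bisection is chosen here only because the monotonicity of $q$ guarantees it converges; any root finder would serve.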



We will now show that the long-term behavior of the age distribution of the population is determined by the positive eigenvalue $\lambda_1$
and its eigenvector $\mathbf{x}_1$.

In Exercise 9 we ask the reader to prove the following result. 



THEOREM 11.18.2 



Eigenvalues of a Leslie Matrix 

If $\lambda_1$ is the unique positive eigenvalue of a Leslie matrix L, and $\lambda_k$ is any other real or complex eigenvalue of L, then $|\lambda_k| \le \lambda_1$.



For our purposes the conclusion in Theorem 11.18.2 is not strong enough; we need $\lambda_1$ to satisfy $|\lambda_k| < \lambda_1$. In this case $\lambda_1$ would be
called the dominant eigenvalue of L. However, as the following example shows, not all Leslie matrices satisfy this condition.



EXAMPLE 2 Leslie Matrix with No Dominant Eigenvalue 



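Let

\[ L = \begin{bmatrix} 0 & 0 & 6 \\ \tfrac{1}{2} & 0 & 0 \\ 0 & \tfrac{1}{3} & 0 \end{bmatrix} \]

Then the characteristic polynomial of L is

\[ p(\lambda) = |\lambda I - L| = \lambda^3 - 1 \]

The eigenvalues of L are thus the solutions of $\lambda^3 = 1$, namely,

\[ \lambda = 1, \qquad -\tfrac{1}{2} + \tfrac{\sqrt{3}}{2} i, \qquad -\tfrac{1}{2} - \tfrac{\sqrt{3}}{2} i \]

All three eigenvalues have absolute value 1, so the unique positive eigenvalue $\lambda_1 = 1$ is not dominant. Note that this matrix has the
property that $L^3 = I$. This means that for any choice of the initial age distribution $\mathbf{x}^{(0)}$, we have

\[ \mathbf{x}^{(0)} = \mathbf{x}^{(3)} = \mathbf{x}^{(6)} = \cdots = \mathbf{x}^{(3k)} = \cdots \]

The age distribution vector thus oscillates with a period of three time units. Such oscillations (or population waves, as they are
called) could not occur if $\lambda_1$ were dominant, as we will see below.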
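The period-three behavior is easy to confirm by direct multiplication. The sketch below assumes the Leslie matrix of this example has first row (0, 0, 6) and survival parameters 1/2 and 1/3, which is the reading consistent with the characteristic polynomial $\lambda^3 - 1$ and with the remark that only the third age class is fertile; `matmul` is an illustrative helper.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

L = [[0, 0, 6],
     [Fraction(1, 2), 0, 0],
     [0, Fraction(1, 3), 0]]

L3 = matmul(L, matmul(L, L))
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(L3 == identity)
```

Since the cube of L is the identity, every age distribution returns to its starting value after three time steps.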

It is beyond the scope of this book to discuss necessary and sufficient conditions for $\lambda_1$ to be a dominant eigenvalue. However, we
will state the following sufficient condition without proof. 



THEOREM 11.18.3 



Dominant Eigenvalue 

If two successive entries $a_i$ and $a_{i+1}$ in the first row of a Leslie matrix L are nonzero, then the positive eigenvalue of L is
dominant. 



Thus, if the female population has two successive fertile age classes, then its Leslie matrix has a dominant eigenvalue. This is 
always the case for realistic populations if the duration of the age classes is sufficiently small. Note that in Example 2 there is only 
one fertile age class (the third), so the condition of Theorem 11.18.3 is not satisfied. In what follows, we always assume that the
condition of Theorem 11.18.3 is satisfied. 



Let us assume that L is diagonalizable. This is not really necessary for the conclusions we will draw, but it does simplify the 
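arguments. In this case, L has n eigenvalues, $\lambda_1, \lambda_2, \ldots, \lambda_n$, not necessarily distinct, and n linearly independent eigenvectors, $\mathbf{x}_1, \mathbf{x}_2,
\ldots, \mathbf{x}_n$, corresponding to them. In this listing we place the dominant eigenvalue $\lambda_1$ first. We construct a matrix P whose columns
are the eigenvectors of L:

\[ P = [\mathbf{x}_1 \mid \mathbf{x}_2 \mid \mathbf{x}_3 \mid \cdots \mid \mathbf{x}_n] \]

The diagonalization of L is then given by the equation

\[
L = P \begin{bmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \lambda_n
\end{bmatrix} P^{-1}
\]

From this it follows that

\[
L^k = P \begin{bmatrix}
\lambda_1^k & 0 & \cdots & 0 \\
0 & \lambda_2^k & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \lambda_n^k
\end{bmatrix} P^{-1}
\]

for $k = 1, 2, \ldots$. For any initial age distribution vector $\mathbf{x}^{(0)}$, we then have

\[
L^k \mathbf{x}^{(0)} = P \begin{bmatrix}
\lambda_1^k & 0 & \cdots & 0 \\
0 & \lambda_2^k & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \lambda_n^k
\end{bmatrix} P^{-1} \mathbf{x}^{(0)}
\]

for $k = 1, 2, \ldots$. Dividing both sides of this equation by $\lambda_1^k$ and using the fact that $\mathbf{x}^{(k)} = L^k \mathbf{x}^{(0)}$, we have

\[
\frac{1}{\lambda_1^k} \mathbf{x}^{(k)} = P \begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & \left(\frac{\lambda_2}{\lambda_1}\right)^k & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \left(\frac{\lambda_n}{\lambda_1}\right)^k
\end{bmatrix} P^{-1} \mathbf{x}^{(0)}
\tag{9}
\]

Because $\lambda_1$ is the dominant eigenvalue, we have $|\lambda_i / \lambda_1| < 1$ for $i = 2, 3, \ldots, n$. It follows that

\[ (\lambda_i / \lambda_1)^k \to 0 \quad \text{as } k \to \infty \quad \text{for } i = 2, 3, \ldots, n \]

Using this fact, we may take the limit of both sides of 9 to obtain

\[
\lim_{k \to \infty} \left\{ \frac{1}{\lambda_1^k} \mathbf{x}^{(k)} \right\} = P \begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0
\end{bmatrix} P^{-1} \mathbf{x}^{(0)}
\tag{10}
\]

Let us denote the first entry of the column vector $P^{-1}\mathbf{x}^{(0)}$ by the constant c. As we ask the reader to show in Exercise 4, the right
side of 10 can be written as $c\mathbf{x}_1$, where c is a positive constant that depends only on the initial age distribution vector $\mathbf{x}^{(0)}$. Thus, 10
becomes

\[ \lim_{k \to \infty} \left\{ \frac{1}{\lambda_1^k} \mathbf{x}^{(k)} \right\} = c\mathbf{x}_1 \tag{11} \]

Equation 11 gives us the approximation

\[ \mathbf{x}^{(k)} \approx c\lambda_1^k \mathbf{x}_1 \tag{12} \]

for large values of k. From 12 we also have

\[ \mathbf{x}^{(k-1)} \approx c\lambda_1^{k-1} \mathbf{x}_1 \tag{13} \]

Comparing Equations 12 and 13, we see that

\[ \mathbf{x}^{(k)} \approx \lambda_1 \mathbf{x}^{(k-1)} \tag{14} \]

for large values of k. This means that for large values of time, each age distribution vector is a scalar multiple of the preceding age
distribution vector, the scalar being the positive eigenvalue of the Leslie matrix. Consequently, the proportion of females in each
of the age classes becomes constant. As we will see in the following example, these limiting proportions can be determined from
the eigenvector $\mathbf{x}_1$.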



EXAMPLE 3 Example 1 Revisited 



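The Leslie matrix in Example 1 was

\[ L = \begin{bmatrix} 0 & 4 & 3 \\ \tfrac{1}{2} & 0 & 0 \\ 0 & \tfrac{1}{4} & 0 \end{bmatrix} \]

Its characteristic polynomial is $p(\lambda) = \lambda^3 - 2\lambda - \tfrac{3}{8}$, and the reader can verify that the positive eigenvalue is $\lambda_1 = \tfrac{3}{2}$. From 8 the
corresponding eigenvector $\mathbf{x}_1$ is

\[
\mathbf{x}_1 = \begin{bmatrix} 1 \\ b_1/\lambda_1 \\ b_1 b_2/\lambda_1^2 \end{bmatrix}
= \begin{bmatrix} 1 \\ \tfrac{1}{2} \big/ \tfrac{3}{2} \\ \left(\tfrac{1}{2}\right)\left(\tfrac{1}{4}\right) \big/ \left(\tfrac{3}{2}\right)^2 \end{bmatrix}
= \begin{bmatrix} 1 \\ \tfrac{1}{3} \\ \tfrac{1}{18} \end{bmatrix}
\]

From 14 we have

\[ \mathbf{x}^{(k)} \approx \tfrac{3}{2} \mathbf{x}^{(k-1)} \]

for large values of k. Hence, every five years the number of females in each of the three classes will increase by about 50%, as will
the total number of females in the population.

From 12 we have

\[ \mathbf{x}^{(k)} \approx c \left(\tfrac{3}{2}\right)^k \begin{bmatrix} 1 \\ \tfrac{1}{3} \\ \tfrac{1}{18} \end{bmatrix} \]

Consequently, eventually the females will be distributed among the three age classes in the ratios 1 : 1/3 : 1/18. This corresponds to a
distribution of 72% of the females in the first age class, 24% of the females in the second age class, and 4% of the females in the
third age class.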
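These limiting figures can be observed by simply iterating the model for many steps. The sketch below assumes the Example 1 parameters (first row 0, 4, 3 and survival fractions 1/2 and 1/4) and an initial population of 1000 females per class; the helper `project` and the choice of 100 steps are illustrative.

```python
from fractions import Fraction

a = [0, 4, 3]                                # first row of L
b = [Fraction(1, 2), Fraction(1, 4)]         # subdiagonal of L

def project(x):
    """One step x(k) = L x(k-1)."""
    return [sum(ai * xi for ai, xi in zip(a, x))] + [bi * xi for bi, xi in zip(b, x)]

x = [Fraction(1000)] * 3
for _ in range(100):
    previous, x = x, project(x)

growth = float(sum(x) / sum(previous))       # ratio of successive totals
shares = [float(v / sum(x)) for v in x]      # proportion in each age class
print(round(growth, 4), [round(s, 4) for s in shares])
```

The growth ratio should be close to 3/2 and the class proportions close to 72%, 24%, and 4%, as predicted by Equations 12 and 14.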



EXAMPLE 4 Female Age Distribution for Humans 



In this example we use birth and death parameters from the year 1965 for Canadian females. Because few women over 50 years of 
age bear children, we restrict ourselves to the portion of the female population between 0 and 50 years of age. The data are for
5-year age classes, so there are a total of 10 age classes. Rather than writing out the 10 x 10 Leslie matrix in full, we list the birth 
and death parameters as follows: 



Age Interval      a_i        b_i
[0, 5)          0.00000    0.99651
[5, 10)         0.00024    0.99820
[10, 15)        0.05861    0.99802
[15, 20)        0.28608    0.99729
[20, 25)        0.44791    0.99694
[25, 30)        0.36399    0.99621
[30, 35)        0.22259    0.99460
[35, 40)        0.10457    0.99184
[40, 45)        0.02826    0.98700
[45, 50)        0.00240

Using numerical techniques, we can approximate the positive eigenvalue and corresponding eigenvector by

\[
\lambda_1 = 1.07622 \qquad \text{and} \qquad
\mathbf{x}_1 = \begin{bmatrix}
1.00000 \\ 0.92594 \\ 0.85881 \\ 0.79641 \\ 0.73800 \\ 0.68364 \\ 0.63281 \\ 0.58482 \\ 0.53897 \\ 0.49429
\end{bmatrix}
\]

Thus, if Canadian women continued to reproduce and die as they did in 1965, eventually every 5 years their numbers would
increase by 7.622%. From the eigenvector $\mathbf{x}_1$, we see that, in the limit, for every 100,000 females between 0 and 5 years of age,
there will be 92,594 females between 5 and 10 years of age, 85,881 females between 10 and 15 years of age, and so forth.
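One simple "numerical technique" of the kind mentioned above is repeated projection with rescaling (the power method). The Python sketch below applies it to the tabulated birth and survival parameters; the function name `project`, the starting vector of ones, and the iteration count are illustrative choices, and the method shown is not necessarily the one used to produce the figures in the text.

```python
# Birth parameters a_1..a_10 and survival parameters b_1..b_9 from the table.
a = [0.00000, 0.00024, 0.05861, 0.28608, 0.44791,
     0.36399, 0.22259, 0.10457, 0.02826, 0.00240]
b = [0.99651, 0.99820, 0.99802, 0.99729, 0.99694,
     0.99621, 0.99460, 0.99184, 0.98700]

def project(x):
    """One 5-year step x(k) = L x(k-1) for the 10 x 10 Leslie matrix."""
    return [sum(ai * xi for ai, xi in zip(a, x))] + [bi * xi for bi, xi in zip(b, x)]

x = [1.0] * 10
for _ in range(2000):
    x = project(x)
    x = [v / x[0] for v in x]      # rescale so the first entry stays 1

lam1 = project(x)[0] / x[0]        # growth factor of the first entry
print(round(lam1, 5), [round(v, 5) for v in x])
```

Rescaling at each step keeps the numbers bounded; the rescaled vector approaches the eigenvector normalized so that its first entry is 1, and the growth factor approaches the dominant eigenvalue.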



Let us look again at Equation 12, which gives the age distribution vector of the population for large times:

\[ \mathbf{x}^{(k)} \approx c\lambda_1^k \mathbf{x}_1 \tag{15} \]

Three cases arise according to the value of the positive eigenvalue $\lambda_1$:

(i) The population is eventually increasing if $\lambda_1 > 1$.

(ii) The population is eventually decreasing if $\lambda_1 < 1$.

(iii) The population eventually stabilizes if $\lambda_1 = 1$.

The case $\lambda_1 = 1$ is particularly interesting because it determines a population that has zero population growth. For any initial age
distribution, the population approaches a limiting age distribution that is some multiple of the eigenvector $\mathbf{x}_1$. From Equations 6
and 7, we see that $\lambda_1 = 1$ is an eigenvalue if and only if

\[ a_1 + a_2 b_1 + a_3 b_1 b_2 + \cdots + a_n b_1 b_2 \cdots b_{n-1} = 1 \tag{16} \]

The expression

\[ R = a_1 + a_2 b_1 + a_3 b_1 b_2 + \cdots + a_n b_1 b_2 \cdots b_{n-1} \tag{17} \]

is called the net reproduction rate of the population. (See Exercise 5 for a demographic interpretation of R.) Thus, we can say that
a population has zero population growth if and only if its net reproduction rate is 1.
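The net reproduction rate is a running sum of birth parameters weighted by cumulative survival, which takes only a few lines to compute. The sketch below assumes Equation 17 has the form $R = a_1 + a_2 b_1 + a_3 b_1 b_2 + \cdots$ and evaluates it for the matrix of Example 2, read here as having first row (0, 0, 6) and survival parameters 1/2 and 1/3; the function name is illustrative.

```python
from fractions import Fraction

def net_reproduction_rate(a, b):
    """R = a1 + a2*b1 + a3*b1*b2 + ... (Equation 17)."""
    total, survival = 0, 1
    for i, ai in enumerate(a):
        total += ai * survival
        if i < len(b):
            survival *= b[i]       # probability of surviving into class i + 2
    return total

# Parameters of Example 2 (assumed reading of that matrix).
R = net_reproduction_rate([0, 0, 6], [Fraction(1, 2), Fraction(1, 3)])
print(R)
```

For these parameters R equals 1, which agrees with the positive eigenvalue $\lambda_1 = 1$ found in Example 2.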



Exercise Set 11.18

1. Suppose that a certain animal population is divided into two age classes and has a Leslie matrix



L = 






(a) Calculate the positive eigenvalue $\lambda_1$ of L and the corresponding eigenvector $\mathbf{x}_1$.

(b) Beginning with the initial age distribution vector

\[ \mathbf{x}^{(0)} = \begin{bmatrix} 100 \\ 0 \end{bmatrix} \]

calculate $\mathbf{x}^{(1)}$, $\mathbf{x}^{(2)}$, $\mathbf{x}^{(3)}$, $\mathbf{x}^{(4)}$, and $\mathbf{x}^{(5)}$, rounding off to the nearest integer when necessary.

(c) Calculate $\mathbf{x}^{(6)}$ using the exact formula $\mathbf{x}^{(6)} = L\mathbf{x}^{(5)}$ and using the approximation formula $\mathbf{x}^{(6)} \approx \lambda_1 \mathbf{x}^{(5)}$.

2. Find the characteristic polynomial of a general Leslie matrix given by Equation 4.

3. (a) Show that the positive eigenvalue $\lambda_1$ of a Leslie matrix is always simple. Recall that a root $\lambda_0$ of a polynomial $q(\lambda)$ is
simple if and only if $q'(\lambda_0) \ne 0$.

(b) Show that the eigenspace corresponding to $\lambda_1$ has dimension 1.

4. Show that the right side of Equation 10 is $c\mathbf{x}_1$, where c is the first entry of the column vector $P^{-1}\mathbf{x}^{(0)}$.

5. Show that the net reproduction rate R, defined by 17, can be interpreted as the average number of daughters born to a single
female during her expected lifetime.

6. Show that a population is eventually decreasing if and only if its net reproduction rate is less than 1. Similarly, show that a
population is eventually increasing if and only if its net reproduction rate is greater than 1.

7. Calculate the net reproduction rate of the animal population in Example 1.

8. (For Readers With a Hand Calculator) Calculate the net reproduction rate of the Canadian female population in Example 4.

9. (For Readers Who Have Read Sections 10.1-10.3) Prove Theorem 11.18.2.

Hint Write $\lambda_k = re^{i\theta}$, substitute into 7, take the real parts of both sides, and show that $r \le \lambda_1$.



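Section 11.18 Technology Exercises

The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets.

T1.

Consider the sequence of Leslie matrices

\[
L_2 = \begin{bmatrix} 0 & a \\ b_1 & 0 \end{bmatrix}, \quad
L_3 = \begin{bmatrix} 0 & 0 & a \\ b_1 & 0 & 0 \\ 0 & b_2 & 0 \end{bmatrix}, \quad
L_4 = \begin{bmatrix} 0 & 0 & 0 & a \\ b_1 & 0 & 0 & 0 \\ 0 & b_2 & 0 & 0 \\ 0 & 0 & b_3 & 0 \end{bmatrix}, \quad
L_5 = \begin{bmatrix} 0 & 0 & 0 & 0 & a \\ b_1 & 0 & 0 & 0 & 0 \\ 0 & b_2 & 0 & 0 & 0 \\ 0 & 0 & b_3 & 0 & 0 \\ 0 & 0 & 0 & b_4 & 0 \end{bmatrix}, \ldots
\]

(a) Use a computer to show that

\[ L_2^2 = I_2, \qquad L_3^3 = I_3, \qquad L_4^4 = I_4, \qquad L_5^5 = I_5, \ldots \]

for a suitable choice of a in terms of $b_1, b_2, \ldots, b_{n-1}$.

(b) From your results in part (a), conjecture a relationship between a and $b_1, b_2, \ldots, b_{n-1}$ that will make $L_n^n = I_n$, where

\[
L_n = \begin{bmatrix}
0 & 0 & 0 & \cdots & 0 & a \\
b_1 & 0 & 0 & \cdots & 0 & 0 \\
0 & b_2 & 0 & \cdots & 0 & 0 \\
0 & 0 & b_3 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & b_{n-1} & 0
\end{bmatrix}
\]

(c) Determine an expression for $p_n(\lambda) = |\lambda I_n - L_n|$ and use it to show that all eigenvalues of $L_n$ satisfy $|\lambda| = 1$ when a
and $b_1, b_2, \ldots, b_{n-1}$ are related by the equation determined in part (b).

T2.

Consider the sequence of Leslie matrices

\[
L_2 = \begin{bmatrix} a & ap \\ b & 0 \end{bmatrix}, \quad
L_3 = \begin{bmatrix} a & ap & ap^2 \\ b & 0 & 0 \\ 0 & b & 0 \end{bmatrix}, \quad
L_4 = \begin{bmatrix} a & ap & ap^2 & ap^3 \\ b & 0 & 0 & 0 \\ 0 & b & 0 & 0 \\ 0 & 0 & b & 0 \end{bmatrix}, \quad
L_5 = \begin{bmatrix} a & ap & ap^2 & ap^3 & ap^4 \\ b & 0 & 0 & 0 & 0 \\ 0 & b & 0 & 0 & 0 \\ 0 & 0 & b & 0 & 0 \\ 0 & 0 & 0 & b & 0 \end{bmatrix}, \ldots
\]

\[
L_n = \begin{bmatrix}
a & ap & ap^2 & \cdots & ap^{n-2} & ap^{n-1} \\
b & 0 & 0 & \cdots & 0 & 0 \\
0 & b & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & b & 0
\end{bmatrix}
\]

where $0 < p < 1$, $0 < b < 1$, and $1 < a$.

(a) Choose a value for n (say, n = 8). For various values of a, b, and p, use a computer to determine the dominant
eigenvalue of $L_n$, and then compare your results to the value of $a + bp$.

(b) Show that

\[ p_n(\lambda) = |\lambda I_n - L_n| = \frac{\lambda^{n+1} - (a + bp)\lambda^n + a(bp)^n}{\lambda - bp} \]

which means that the eigenvalues of $L_n$ must satisfy

\[ \lambda^{n+1} - (a + bp)\lambda^n + a(bp)^n = 0 \]

(c) Can you now provide a rough proof to explain the fact that $\lambda_1 \approx a + bp$?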



T3.

Suppose that a population of mice has a Leslie matrix L over a 1-month period and an initial age distribution vector $\mathbf{x}^{(0)}$ given
by

14 3 



L = 



° ni° 

^ 

-9- 

0^000 

| 

-^ 



and

\[ \mathbf{x}^{(0)} = \begin{bmatrix} 50 \\ 40 \\ 30 \\ 20 \\ 10 \\ 5 \end{bmatrix} \]



(a) Compute the net reproduction rate of the population. 



(b) Compute the age distribution vector after 100 months and 101 months, and show that the vector after 101 months is
approximately a scalar multiple of the vector after 100 months.

(c) Compute the dominant eigenvalue of L and its corresponding eigenvector. How are they related to your results in part 
(b)? 

(d) Suppose you wish to control the mouse population by feeding it a substance that decreases its age-specific birthrates 
(the entries in the first row of L) by a constant fraction. What range of fractions would cause the population eventually 
to decrease?

11.19

HARVESTING OF ANIMAL 
POPULATIONS 



In this section we employ the Leslie matrix model of population growth to model 
the sustainable harvesting of an animal population. We also examine the effect 
of harvesting different fractions of different age groups. 



Prerequisites: Age-specific Population Growth (Section 11.18) 



Harvesting 

In Section 11.18 we used the Leslie matrix model to examine the growth of a female population that was divided into discrete age
classes. In this section, we investigate the effects of harvesting an animal population growing according to such a model. By 
harvesting we mean the removal of animals from the population. (The word harvesting is not necessarily a euphemism for 
"slaughtering"; the animals may be removed from the population for other purposes.) 

In this section we restrict ourselves to sustainable harvesting policies. By this we mean the following: 



DEFINITION 



A harvesting policy in which an animal population is periodically harvested is said to be sustainable if the yield of each harvest 
is the same and the age distribution of the population remaining after each harvest is the same. 



Thus, the animal population is not depleted by a sustainable harvesting policy; only the excess growth is removed. 

As in Section 11.18, we will discuss only the females of the population. If the number of males in each age class is equal to the
number of females — a reasonable assumption for many populations — then our harvesting policies will also apply to the male 
portion of the population. 

The Harvesting Model 

Figure 11.19.1 illustrates the basic idea of the model. We begin with a population having a particular age distribution. It undergoes
a growth period that will be described by the Leslie matrix. At the end of the growth period, a certain fraction of each age class is 
harvested in such a way that the unharvested population has the same age distribution as the original population. This cycle repeats 
after each harvest so that the yield is sustainable. The duration of the harvest is assumed to be short in comparison with the growth 
period so that any growth or change in the population during the harvest period can be neglected. 



Figure 11.19.1 (panel labels: Population before growth period; Population after growth period)



To describe this harvesting model mathematically, let 



$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$

be the age distribution vector of the population at the beginning of the growth period. Thus $x_i$ is the number of females in the $i$th
class left unharvested. As in Section 11.18, we require that the duration of each age class be identical with the duration of the
growth period. For example, if the population is harvested once a year, then the population is divided into 1-year age classes.

If $L$ is the Leslie matrix describing the growth of the population, then the vector $Lx$ is the age distribution vector of the population
at the end of the growth period, immediately before the periodic harvest. Let $h_i$, for $i = 1, 2, \ldots, n$, be the fraction of females from
the $i$th class that is harvested. We use these $n$ numbers to form an $n \times n$ diagonal matrix

$$H = \begin{bmatrix} h_1 & 0 & 0 & \cdots & 0 \\ 0 & h_2 & 0 & \cdots & 0 \\ 0 & 0 & h_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & h_n \end{bmatrix}$$

which we shall call the harvesting matrix. By definition, we have

$$0 \le h_i \le 1 \quad (i = 1, 2, \ldots, n)$$

That is, we may harvest none ($h_i = 0$), all ($h_i = 1$), or some fraction ($0 < h_i < 1$) of each of the $n$ classes. Because the number of
females in the $i$th class immediately before each harvest is the $i$th entry $(Lx)_i$ of the vector $Lx$, the $i$th entry of the column vector

$$HLx = \begin{bmatrix} h_1 (Lx)_1 \\ h_2 (Lx)_2 \\ \vdots \\ h_n (Lx)_n \end{bmatrix}$$

is the number of females harvested from the $i$th class.

From the definition of a sustainable harvesting policy, we have 

[age distribution at end of growth period] $-$ [harvest] $=$ [age distribution at beginning of growth period]

or, mathematically,

$$Lx - HLx = x \tag{1}$$

If we write Equation 1 in the form

$$(I - H)Lx = x \tag{2}$$

we see that $x$ must be an eigenvector of the matrix $(I - H)L$ corresponding to the eigenvalue 1. As we will now show, this places
certain restrictions on the values of $h_i$ and $x$.



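Suppose that the Leslie matrix of the population is

$$L = \begin{bmatrix} a_1 & a_2 & a_3 & \cdots & a_{n-1} & a_n \\ b_1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & b_2 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & b_{n-1} & 0 \end{bmatrix} \tag{3}$$

Then the matrix $(I - H)L$ is (verify)

$$(I - H)L = \begin{bmatrix} (1-h_1)a_1 & (1-h_1)a_2 & (1-h_1)a_3 & \cdots & (1-h_1)a_{n-1} & (1-h_1)a_n \\ (1-h_2)b_1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & (1-h_3)b_2 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & (1-h_n)b_{n-1} & 0 \end{bmatrix}$$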

Thus, we see that $(I - H)L$ is a matrix with the same mathematical form as a Leslie matrix. In Section 11.18 we showed that a
necessary and sufficient condition for a Leslie matrix to have 1 as an eigenvalue is that its net reproduction rate also be 1 [see Eq.
16 of Section 11.18]. Calculating the net reproduction rate of $(I - H)L$ and setting it equal to 1, we obtain (verify)

$$(1-h_1)\left[a_1 + a_2 b_1 (1-h_2) + a_3 b_1 b_2 (1-h_2)(1-h_3) + \cdots + a_n b_1 b_2 \cdots b_{n-1}(1-h_2)(1-h_3)\cdots(1-h_n)\right] = 1 \tag{4}$$

This equation places a restriction on the allowable harvesting fractions. Only those values of $h_1, h_2, \ldots, h_n$ that satisfy Equation 4 and that
lie in the interval [0, 1] can produce a sustainable yield.
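
As a numerical check on Equation 4, the following sketch evaluates the net reproduction rate of $(I - H)L$ and compares it with the dominant eigenvalue of that matrix. The three-class birth rates, survival rates, and harvesting fractions are hypothetical values chosen for illustration, and the helper names `leslie` and `harvested_net_rate` are our own, not notation from the text.

```python
import numpy as np

def leslie(a, b):
    """Leslie matrix with birth rates a (first row) and survival rates b (subdiagonal)."""
    n = len(a)
    L = np.zeros((n, n))
    L[0, :] = a
    L[np.arange(1, n), np.arange(n - 1)] = b
    return L

def harvested_net_rate(a, b, h):
    """Left side of Equation 4: the net reproduction rate of (I - H)L."""
    a, b, h = map(np.asarray, (a, b, h))
    # survive[k] = b_1 ... b_k (empty product is 1)
    survive = np.concatenate(([1.0], np.cumprod(b)))
    # kept[k] = (1 - h_1)(1 - h_2) ... (1 - h_{k+1})
    kept = np.cumprod(1.0 - h)
    return float(np.sum(a * survive * kept))

# Hypothetical data (an assumption, not taken from the text).
a = [0.0, 1.5, 1.0]        # birth rates a_1, a_2, a_3
b = [0.8, 0.5]             # survival rates b_1, b_2
h = [0.375, 0.0, 0.0]      # harvest 37.5% of the youngest class only

M = (np.eye(3) - np.diag(h)) @ leslie(a, b)
rate = harvested_net_rate(a, b, h)
dominant = max(np.linalg.eigvals(M).real)
print(rate, dominant)      # both should be 1 (up to rounding) for a sustainable policy
```

With these numbers the left side of Equation 4 is $0.625\,(0 + 1.2 + 0.4) = 1$, so the policy is sustainable and $(I - H)L$ has 1 as its dominant eigenvalue.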

If $h_1, h_2, \ldots, h_n$ do satisfy Equation 4, then the matrix $(I - H)L$ has the desired eigenvalue $\lambda_1 = 1$. Furthermore, this eigenvalue has
multiplicity 1, because the positive eigenvalue of a Leslie matrix always has multiplicity 1 (Theorem 11.18.1). This means that
there is only one linearly independent eigenvector $x$ satisfying Equation 2. [See Exercise 3(b) of Section 11.18.] One possible
choice for $x$ is the following normalized eigenvector:

$$x_1 = \begin{bmatrix} 1 \\ b_1(1-h_2) \\ b_1 b_2 (1-h_2)(1-h_3) \\ b_1 b_2 b_3 (1-h_2)(1-h_3)(1-h_4) \\ \vdots \\ b_1 b_2 b_3 \cdots b_{n-1}(1-h_2)(1-h_3)\cdots(1-h_n) \end{bmatrix} \tag{5}$$



Any other solution $x$ of Equation 2 is a multiple of $x_1$. Thus, the vector $x_1$ determines the proportion of females within each of the $n$ classes
after a harvest under a sustainable harvesting policy. But there is an ambiguity in the total number of females in the population
after each harvest. This can be determined by some auxiliary condition, such as an ecological or economic constraint. For example,
for a population economically supported by the harvester, the largest population the harvester can afford to raise between harvests
would determine the particular constant that $x_1$ is multiplied by to produce the appropriate vector $x$ in Equation 2. For a wild
population, the natural habitat of the population would determine how large the total population could be between harvests.

Summarizing our results so far, we see that there is a wide choice in the values of $h_1, h_2, \ldots, h_n$ that will produce a sustainable
yield. But once these values are selected, the proportional age distribution of the population after each harvest is uniquely
determined by the normalized eigenvector $x_1$ defined by Equation 5. We now consider a few particular harvesting strategies of this
type.

Uniform Harvesting 

With many populations it is difficult to distinguish or catch animals of specific ages. If animals are caught at random, we can 
reasonably assume that the same fraction of each age class is harvested. We therefore set 

$$h = h_1 = h_2 = \cdots = h_n$$

Equation 2 then reduces to (verify)

$$Lx = \left(\frac{1}{1-h}\right)x$$

Hence, $1/(1-h)$ must be the unique positive eigenvalue $\lambda_1$ of the Leslie growth matrix $L$. That is,

$$\lambda_1 = \frac{1}{1-h}$$

Solving for the harvesting fraction $h$, we obtain

$$h = 1 - (1/\lambda_1) \tag{6}$$

The vector $x_1$, in this case, is the same as the eigenvector of $L$ corresponding to the eigenvalue $\lambda_1$. From Equation 8 of Section
11.18, this is

$$x_1 = \begin{bmatrix} 1 \\ b_1/\lambda_1 \\ b_1 b_2/\lambda_1^2 \\ b_1 b_2 b_3/\lambda_1^3 \\ \vdots \\ b_1 b_2 \cdots b_{n-1}/\lambda_1^{n-1} \end{bmatrix} \tag{7}$$

From Equation 6 we can see that the larger $\lambda_1$ is, the larger is the fraction of animals we can harvest without depleting the population. Note
that we need $\lambda_1 > 1$ in order for the harvesting fraction $h$ to lie in the interval (0, 1). This is to be expected, because $\lambda_1 > 1$ is the
condition that the population be increasing.
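
The uniform harvesting recipe can be sketched numerically as follows. The three-class Leslie matrix below is a hypothetical example (it is not taken from the text), and the dominant eigenvalue is found with NumPy's general eigenvalue routine rather than by any particular method from Section 11.18.

```python
import numpy as np

# Hypothetical three-class Leslie matrix (an assumption for illustration).
L = np.array([[0.0, 1.5, 1.0],
              [0.8, 0.0, 0.0],
              [0.0, 0.5, 0.0]])
b = np.array([0.8, 0.5])                  # survival rates b_1, b_2 read off the subdiagonal

eigvals = np.linalg.eigvals(L)
lam1 = eigvals[np.argmax(eigvals.real)].real   # unique positive (dominant) eigenvalue
h = 1.0 - 1.0 / lam1                           # Equation 6

# Equation 7: x1 = (1, b1/lam1, b1*b2/lam1**2)
x1 = np.concatenate(([1.0], np.cumprod(b / lam1)))

# Sustainability check: with H = h*I, (I - H) L x1 should reproduce x1.
after_harvest = (1.0 - h) * (L @ x1)
print(f"lambda_1 = {lam1:.4f}, h = {h:.4f}")
print(np.allclose(after_harvest, x1))
```

Because $Lx_1 = \lambda_1 x_1$, removing the fraction $h = 1 - 1/\lambda_1$ from every class returns the population exactly to $x_1$, which is what the final check confirms.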



EXAMPLE 1 Harvesting Sheep 



For a certain species of domestic sheep in New Zealand with a growth period of 1 year, the following Leslie matrix was found (see
G. Caughley, "Parameters for Seasonally Breeding Populations," Ecology, 48, 1967, pp. 834-839).

The sheep have a lifespan of 12 years, so they are divided into 12 age classes of duration 1 year each. By the use of numerical
techniques, the unique positive eigenvalue of $L$ can be found to be

$$\lambda_1 = 1.176$$

From Equation 6, the harvesting fraction $h$ is

$$h = 1 - (1/\lambda_1) = 1 - (1/1.176) = .150$$

Thus, the uniform harvesting policy is one in which 15.0% of the sheep from each of the 12 age classes is harvested every year.
From Equation 7 the age distribution vector of the sheep after each harvest is proportional to

$$x_1 = \begin{bmatrix} 1.000 \\ 0.719 \\ 0.596 \\ 0.489 \\ 0.395 \\ 0.311 \\ 0.237 \\ 0.171 \\ 0.114 \\ 0.067 \\ 0.032 \\ 0.010 \end{bmatrix} \tag{8}$$

From Equation 8 we see that for every 1000 sheep between 0 and 1 year of age that are not harvested, there are 719 sheep between 1 and 2
years of age, 596 sheep between 2 and 3 years of age, and so forth.



Harvesting Only the Youngest Age Class 

In some populations only the youngest females are of any economic value, so the harvester seeks to harvest only the females from 
the youngest age class. Accordingly, let us set 

$$h_1 = h, \qquad h_2 = h_3 = \cdots = h_n = 0$$

Equation 4 then reduces to

$$(1-h)(a_1 + a_2 b_1 + a_3 b_1 b_2 + \cdots + a_n b_1 b_2 \cdots b_{n-1}) = 1$$

or

$$(1-h)R = 1$$

where $R$ is the net reproduction rate of the population. [See Equation 17 of Section 11.18.] Solving for $h$, we obtain

$$h = 1 - (1/R) \tag{9}$$

Note from this equation that a sustainable harvesting policy is possible only if $R > 1$. This is reasonable because only if $R > 1$ is the
population increasing. From Equation 5, the age distribution vector after each harvest is proportional to the vector

$$x_1 = \begin{bmatrix} 1 \\ b_1 \\ b_1 b_2 \\ b_1 b_2 b_3 \\ \vdots \\ b_1 b_2 b_3 \cdots b_{n-1} \end{bmatrix} \tag{10}$$



EXAMPLE 2 Sustainable Harvesting Policy 



Let us apply this type of sustainable harvesting policy to the sheep population in Example 1. For the net reproduction rate of the
population we find

$$\begin{aligned} R &= a_1 + a_2 b_1 + a_3 b_1 b_2 + \cdots + a_n b_1 b_2 \cdots b_{n-1} \\ &= (.000) + (.045)(.845) + \cdots + (.421)(.845)(.975)\cdots(.370) \\ &= 2.514 \end{aligned}$$

From Equation 9, the fraction of the first age class harvested is

$$h = 1 - (1/R) = 1 - (1/2.514) = .602$$

From Equation 10, the age distribution of the sheep population after the harvest is proportional to the vector

$$x_1 = \begin{bmatrix} 1.000 \\ .845 \\ (.845)(.975) \\ (.845)(.975)(.965) \\ \vdots \\ (.845)(.975)\cdots(.370) \end{bmatrix}$$

A direct calculation gives us the following (see also Exercise 3):

$$x_1 = \begin{bmatrix} 1.000 \\ 0.845 \\ 0.824 \\ 0.795 \\ 0.755 \\ 0.699 \\ 0.626 \\ 0.532 \\ 0.418 \\ 0.289 \\ 0.162 \\ 0.060 \end{bmatrix} \tag{11}$$

$$Lx_1 = \begin{bmatrix} 2.514 \\ 0.845 \\ 0.824 \\ 0.795 \\ 0.755 \\ 0.699 \\ 0.626 \\ 0.532 \\ 0.418 \\ 0.289 \\ 0.162 \\ 0.060 \end{bmatrix} \tag{12}$$

The vector $Lx_1$ is the age distribution vector immediately before the harvest. The total of all entries in $Lx_1$ is 8.520, so the first
entry 2.514 is 29.5% of the total. This means that immediately before each harvest, 29.5% of the population is in the youngest age
class. Since 60.2% of this class is harvested, it follows that 17.8% (= 60.2% of 29.5%) of the entire sheep population is harvested
each year. This can be compared with the uniform harvesting policy of Example 1, in which 15.0% of the sheep population is
harvested each year.
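
The percentages quoted in this example can be rechecked with a few lines of arithmetic. The sketch below uses only the rounded entries of $Lx_1$ and the value $h = .602$ from the example; the ninth entry is taken as 0.418, the value that appears in Equation 11. Because the entries are rounded to three decimals, the computed total can differ from 8.520 in the last digit.

```python
# Rounded entries of L x1 (ninth entry 0.418, as in Equation 11).
Lx1 = [2.514, 0.845, 0.824, 0.795, 0.755, 0.699,
       0.626, 0.532, 0.418, 0.289, 0.162, 0.060]
h = 0.602                      # fraction of the youngest class harvested (Equation 9)

total = sum(Lx1)                       # population size just before the harvest
youngest_share = Lx1[0] / total        # share of the population in the youngest class
yield_fraction = h * youngest_share    # share of the whole population harvested

print(f"total = {total:.3f}")
print(f"youngest share = {youngest_share:.1%}")
print(f"yield = {yield_fraction:.1%}")
```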

Optimal Sustainable Yield 

We saw in Example 1 that a sustainable harvesting policy in which the same fraction of each age class is harvested produces a 
yield of 15.0% of the sheep population. In Example 2 we saw that if only the youngest age class is harvested, the resulting yield is 
17.8% of the population. There are many other possible sustainable harvesting policies, and each generally provides a different 
yield. It would be of interest to find a sustainable harvesting policy that produces the largest possible yield. Such a policy is called 
an optimal sustainable harvesting policy , and the resulting yield is called the optimal sustainable yield. However, determining the 
optimal sustainable yield requires linear programming theory, which we will not discuss here. We refer the reader to the following 
result, which appears in J. R. Beddington and D. B. Taylor, "Optimum Age Specific Harvesting of a Population," Biometrics, 29, 
1973, pp. 801-809. 



THEOREM 11.19.1 



Optimal Sustainable Yield 

An optimal sustainable harvesting policy is one in which either one or two age classes are harvested. If two age classes are 
harvested, then the older age class is completely harvested. 



As an illustration, it can be shown that the optimal sustainable yield of the sheep population is attained when

$$h_1 = 0.522, \qquad h_9 = 1.000 \tag{13}$$

and all other values of $h_i$ are zero. Thus, 52.2% of the sheep between 0 and 1 year of age and all the sheep between 8 and 9 years
of age are harvested. As we ask the reader to show in Exercise 2, the resulting optimal sustainable yield is 19.9% of the population.
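
Theorem 11.19.1 limits the search for an optimal policy to finitely many cases: harvest part of one class, or part of one class together with all of an older class. The following is a minimal brute-force sketch of such a search, not the linear programming method of Beddington and Taylor. The function names `policy_yield` and `best_policy` and the three-class data are hypothetical. In each case Equation 4 is solved for the single unknown fraction, Equation 5 gives $x_1$, and the yield is the harvested share of $Lx_1$.

```python
import numpy as np

def policy_yield(a, b, i, j=None):
    """Yield fraction when class i is partly harvested and class j (if given) is
    completely harvested; classes are numbered from 1. Returns None if infeasible."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n = len(a)
    # t[k] = a_{k+1} * b_1 ... b_k: terms of the unharvested net reproduction rate
    t = a * np.concatenate(([1.0], np.cumprod(b)))
    stop = n if j is None else j - 1          # terms from class j onward vanish when h_j = 1
    low, mid = t[:i - 1].sum(), t[i - 1:stop].sum()
    if mid <= 0:
        return None
    h_i = 1.0 - (1.0 - low) / mid             # Equation 4 solved for h_i
    if not 0.0 <= h_i <= 1.0:
        return None
    h = np.zeros(n)
    h[i - 1] = h_i
    if j is not None:
        h[j - 1] = 1.0
    # Equation 5: x1 = (1, b1(1-h2), b1 b2 (1-h2)(1-h3), ...)
    x1 = np.concatenate(([1.0], np.cumprod(b * (1.0 - h[1:]))))
    L = np.zeros((n, n))
    L[0, :] = a
    L[np.arange(1, n), np.arange(n - 1)] = b
    before = L @ x1                           # age distribution just before the harvest
    return float(h @ before / before.sum()), h_i

def best_policy(a, b):
    """Return (yield, i, j, h_i) for the best one- or two-age-class policy."""
    n = len(a)
    candidates = [(i, None) for i in range(1, n + 1)]
    candidates += [(i, j) for i in range(1, n) for j in range(i + 1, n + 1)]
    results = []
    for i, j in candidates:
        out = policy_yield(a, b, i, j)
        if out is not None:
            results.append((out[0], i, j, out[1]))
    return max(results, key=lambda r: r[0])

# Hypothetical three-class population (an assumption, not data from the text).
best = best_policy([0.0, 1.5, 1.0], [0.8, 0.5])
print(best)
```

For this small example the search selects a two-age-class policy in which the oldest class is completely harvested, which is the pattern described in the theorem.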



Exercise Set 11.19 






1. Let a certain animal population be divided into three 1-year age classes and have as its Leslie matrix

$$L = \begin{bmatrix} 0 & 4 & 3 \\ \frac{1}{2} & 0 & 0 \\ 0 & \frac{1}{4} & 0 \end{bmatrix}$$






(a) Find the yield and the age distribution vector after each harvest if the same fraction of each of the three age classes is 
harvested every year. 

(b) Find the yield and the age distribution vector after each harvest if only the youngest age class is harvested every year. 
Also, find the fraction of the youngest age class that is harvested. 



2. For the optimal sustainable harvesting policy described by Equations 13, find the vector $x_1$ that specifies the age distribution of
the population after each harvest. Also calculate the vector $Lx_1$ and verify that the optimal sustainable yield is 19.9% of the
population.

3. Use Equation 10 to show that if only the first age class of an animal population is harvested,

$$Lx_1 - x_1 = \begin{bmatrix} R - 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

where $R$ is the net reproduction rate of the population.

4. If only the $I$th class of an animal population is to be periodically harvested ($I = 1, 2, \ldots, n$), find the corresponding harvesting
fraction $h_I$.

5. Suppose that all of the $J$th class and a certain fraction $h_I$ of the $I$th class of an animal population is to be periodically harvested
($1 \le I < J \le n$). Calculate $h_I$.



Section 11.19

Technology Exercises



The following exercises are designed to be solved using a technology utility. Typically, this will be matlab, Mathematica, Maple, 
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 



T1. The results of Theorem 11.19.1 suggest the following algorithm for determining the optimal sustainable yield.

(i) For each value of $i = 1, 2, \ldots, n$, set $h_i = h$ and $h_k = 0$ for $k \ne i$ and calculate the respective yields. These $n$
calculations give the one-age-class results. Of course, any calculation leading to a value of $h$ not between 0 and 1 is
rejected.

(ii) For each value of $i = 1, 2, \ldots, n-1$ and $j = i+1, i+2, \ldots, n$, set $h_i = h$, $h_j = 1$, and $h_k = 0$ for $k \ne i, j$ and calculate
the respective yields. These $\frac{1}{2}n(n-1)$ calculations give the two-age-class results. Of course, any calculation leading
to a value of $h$ not between 0 and 1 is again rejected.

(iii) Of the yields calculated in parts (i) and (ii), the largest is the optimal sustainable yield. Note that there will be at most

$$n + \tfrac{1}{2}n(n-1) = \tfrac{1}{2}n(n+1)$$

calculations in all. Once again, some of these may lead to a value of $h$ not between 0 and 1
and must therefore be rejected.

If we use this algorithm for the sheep example in the text, there will be at most $\frac{1}{2}(12)(12+1) = 78$ calculations to consider.
Use a computer to do the two-age-class calculations for $h_1 = h$, $h_j = 1$, and $h_k = 0$ for $k \ne 1$ or $j$ for $j = 2, 3, \ldots, 12$. Construct
a summary table consisting of the values of $h_1$ and the percentage yields using $j = 2, 3, \ldots, 12$, which will show that the
largest of these yields occurs when $j = 9$.

T2. Using the algorithm in Exercise T1, do the one-age-class calculations for $h_i = h$ and $h_k = 0$ for $k \ne i$ for $i = 1, 2, \ldots, 12$.
Construct a summary table consisting of the values of $h_i$ and the percentage yields using $i = 1, 2, \ldots, 12$, which will show that
the largest of these yields occurs when $i = 9$.

T3. Referring to the mouse population in Exercise T3 of Section 11.18, suppose that reducing the birthrates is not practical, so
you instead decide to control the population by uniformly harvesting all of the age classes monthly.

(a) What fraction of the population must be harvested monthly to bring the mouse population to equilibrium eventually? 

(b) What is the equilibrium age distribution vector under this uniform harvesting policy? 

(c) The total number of mice in the original mouse population was 155. What would be the total number of mice after 5,
10, and 200 months under your uniform harvesting policy?

11.20

A LEAST SQUARES 
MODEL FOR HUMAN 
HEARING 



In this section we apply the method of least squares approximation to a model 
for human hearing. The use of this method is motivated by energy 
considerations. 



Prerequisites: Inner Product Spaces 
Orthogonal Projection 
Fourier Series (Section 9.4) 



Anatomy of the Ear 

We begin with a brief discussion of the nature of sound and human hearing. Figure 11.20.1 is a schematic diagram of the ear
showing its three main components: the outer ear, middle ear, and inner ear. Sound waves enter the outer ear where they are 
channeled to the eardrum, causing it to vibrate. Three tiny bones in the middle ear mechanically link the eardrum with the 
snail-shaped cochlea within the inner ear. These bones pass on the vibrations of the eardrum to a fluid within the cochlea. The 
cochlea contains thousands of minute hairs that oscillate with the fluid. Those near the entrance of the cochlea are stimulated by 
high frequencies, and those near the tip are stimulated by low frequencies. The movements of these hairs activate nerve cells that 
send signals along various neural pathways to the brain, where the signals are interpreted as sound. 







Figure 11.20.1 



The sound waves themselves are variations in time of the air pressure. For the auditory system, the most elementary type of sound 
wave is a sinusoidal variation in the air pressure. This type of sound wave stimulates the hairs within the cochlea in such a way that 
a nerve impulse along a single neural pathway is produced (Figure 11.20.2). A sinusoidal sound wave can be described by a
function of time

$$q(t) = A_0 + A\sin(\omega t - \delta) \tag{1}$$

where $q(t)$ is the atmospheric pressure at the eardrum, $A_0$ is the normal atmospheric pressure, $A$ is the maximum deviation of the
pressure from the normal atmospheric pressure, $\omega/2\pi$ is the frequency of the wave in cycles per second, and $\delta$ is the phase angle of
the wave. To be perceived as sound, such sinusoidal waves must have frequencies within a certain range. For humans this range is
roughly 20 cycles per second (cps) to 20,000 cps. Frequencies outside this range will not stimulate the hairs within the cochlea enough to produce
nerve signals.

Figure 11.20.2



To a reasonable degree of accuracy, the ear is a linear system. This means that if a complex sound wave is a finite sum of
sinusoidal components of different amplitudes, frequencies, and phase angles, say,

$$q(t) = A_0 + A_1\sin(\omega_1 t - \delta_1) + A_2\sin(\omega_2 t - \delta_2) + \cdots + A_n\sin(\omega_n t - \delta_n) \tag{2}$$

then the response of the ear consists of nerve impulses along the same neural pathways that would be stimulated by the individual
components (Figure 11.20.3).

Figure 11.20.3

Let us now consider some periodic sound wave $p(t)$ with period $T$ [i.e., $p(t) = p(t + T)$] that is not a finite sum of sinusoidal
waves. If we examine the response of the ear to such a periodic wave, we find that it is the same as the response to some wave that
is the sum of sinusoidal waves. That is, there is some sound wave $q(t)$ as given by Equation 2 that produces the same response as
$p(t)$, even though $p(t)$ and $q(t)$ are different functions of time.

We now want to determine the frequencies, amplitudes, and phase angles of the sinusoidal components of $q(t)$. Because $q(t)$
produces the same response as the periodic wave $p(t)$, it is reasonable to expect that $q(t)$ has the same period $T$ as $p(t)$. This
requires that each sinusoidal term in $q(t)$ have period $T$. Consequently, the frequencies of the sinusoidal components must be
integer multiples of the basic frequency $1/T$ of the function $p(t)$. Thus, the $\omega_k$ in Equation 2 must be of the form

$$\omega_k = 2k\pi/T, \quad k = 1, 2, \ldots$$

But because the ear cannot perceive sinusoidal waves with frequencies greater than 20,000 cps, we may omit those values of $k$ for
which $\omega_k/2\pi = k/T$ is greater than 20,000. Thus, $q(t)$ is of the form

$$q(t) = A_0 + A_1\sin\left(\frac{2\pi t}{T} - \delta_1\right) + \cdots + A_n\sin\left(\frac{2n\pi t}{T} - \delta_n\right) \tag{3}$$

where $n$ is the largest integer such that $n/T$ is not greater than 20,000.



We now turn our attention to the values of the amplitudes $A_0, A_1, \ldots, A_n$ and the phase angles $\delta_1, \delta_2, \ldots, \delta_n$ that appear in Equation
3. There is some criterion by which the auditory system "picks" these values so that $q(t)$ produces the same response as $p(t)$. To
examine this criterion, let us set

$$e(t) = p(t) - q(t)$$

If we consider $q(t)$ as an approximation to $p(t)$, then $e(t)$ is the error in this approximation, an error that the ear cannot perceive.

In terms of $e(t)$, the criterion for the determination of the amplitudes and the phase angles is that the quantity

$$\int_0^T [e(t)]^2\,dt = \int_0^T [p(t) - q(t)]^2\,dt \tag{4}$$

be as small as possible. We cannot go into the physiological reasons for this, but we note that this expression is proportional to the
acoustic energy of the error wave $e(t)$ over one period. In other words, it is the energy of the difference between the two sound
waves $p(t)$ and $q(t)$ that determines whether the ear perceives any difference between them. If this energy is as small as possible,
then the two waves produce the same sensation of sound. Mathematically, the function $q(t)$ in Equation 4 is the least squares approximation
to $p(t)$ from the vector space $C[0, T]$ of continuous functions on the interval $[0, T]$. (See Section 9.4.)

Least squares approximations by continuous functions arise in a wide variety of engineering and scientific approximation 
problems. Apart from the acoustics problem just discussed, some other examples follow. 



1. Let $S(x)$ be the axial strain distribution in a uniform rod lying along the x-axis from $x = 0$ to $x = l$ (Figure 11.20.4). The
strain energy in the rod is proportional to the integral

$$\int_0^l [S(x)]^2\,dx$$

The closeness of an approximation $q(x)$ to $S(x)$ can be judged according to the strain energy of the
difference of the two strain distributions. That energy is proportional to

$$\int_0^l [S(x) - q(x)]^2\,dx$$

which is a least squares criterion.

Figure 11.20.4

2. Let $E(t)$ be a periodic voltage across a resistor in an electrical circuit (Figure 11.20.5). The electrical energy transferred to
the resistor during one period $T$ is proportional to

$$\int_0^T [E(t)]^2\,dt$$

If $q(t)$ has the same period as $E(t)$ and is to be an approximation to $E(t)$, then the criterion of
closeness might be taken as the energy of the difference voltage. This is proportional to

$$\int_0^T [E(t) - q(t)]^2\,dt$$

which is again a least squares criterion.

Figure 11.20.5

3. Let $y(x)$ be the vertical displacement of a uniform flexible string whose equilibrium position is along the x-axis from $x = 0$
to $x = l$ (Figure 11.20.6). The elastic potential energy of the string is proportional to

$$\int_0^l [y(x)]^2\,dx$$

If $q(x)$ is to be an approximation to the displacement, then as before, the energy integral

$$\int_0^l [y(x) - q(x)]^2\,dx$$

determines a least squares criterion for the closeness of the approximation.

Figure 11.20.6

Least squares approximation is also used in situations where there is no a priori justification for its use, such as for approximating 
business cycles, population growth curves, sales curves, and so forth. It is used in these cases because of its mathematical 
simplicity. In general, if no other error criterion is immediately apparent for an approximation problem, the least squares criterion 
is the one most often chosen. 

The following result was obtained in Section 9.4. 



THEOREM 11.20.1 



Minimizing Mean Square Error on $[0, 2\pi]$

If $f(t)$ is continuous on $[0, 2\pi]$, then the trigonometric function $g(t)$ of the form

$$g(t) = \tfrac{1}{2}a_0 + a_1\cos t + \cdots + a_n\cos nt + b_1\sin t + \cdots + b_n\sin nt$$

that minimizes the mean square error

$$\int_0^{2\pi} [f(t) - g(t)]^2\,dt$$

has coefficients

$$a_k = \frac{1}{\pi}\int_0^{2\pi} f(t)\cos kt\,dt, \quad k = 0, 1, 2, \ldots, n$$

$$b_k = \frac{1}{\pi}\int_0^{2\pi} f(t)\sin kt\,dt, \quad k = 1, 2, \ldots, n$$



If the original function $f(t)$ is defined over the interval $[0, T]$ instead of $[0, 2\pi]$, a change of scale will yield the following result
(see Exercise 8):

THEOREM 11.20.2

Minimizing Mean Square Error on $[0, T]$

If $f(t)$ is continuous on $[0, T]$, then the trigonometric function $g(t)$ of the form

$$g(t) = \tfrac{1}{2}a_0 + a_1\cos\frac{2\pi}{T}t + \cdots + a_n\cos\frac{2n\pi}{T}t + b_1\sin\frac{2\pi}{T}t + \cdots + b_n\sin\frac{2n\pi}{T}t$$

that minimizes the mean square error

$$\int_0^T [f(t) - g(t)]^2\,dt$$

has coefficients

$$a_k = \frac{2}{T}\int_0^T f(t)\cos\frac{2k\pi t}{T}\,dt, \quad k = 0, 1, 2, \ldots, n$$

$$b_k = \frac{2}{T}\int_0^T f(t)\sin\frac{2k\pi t}{T}\,dt, \quad k = 1, 2, \ldots, n$$





EXAMPLE 1 Least Squares Approximation to a Sound Wave 



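Let a sound wave $p(t)$ have a saw-tooth pattern with a basic frequency of 5000 cps (Figure 11.20.7). Assume units are chosen so
that the normal atmospheric pressure is at the zero level and the maximum amplitude of the wave is $A$. The basic period of the
wave is $T = 1/5000 = .0002$ second. From $t = 0$ to $t = T$, the function $p(t)$ has the equation

$$p(t) = \frac{2A}{T}\left(\frac{T}{2} - t\right)$$

Theorem 11.20.2 then yields the following (verify):

$$a_0 = \frac{2}{T}\int_0^T p(t)\,dt = \frac{2}{T}\int_0^T \frac{2A}{T}\left(\frac{T}{2} - t\right)dt = 0$$

$$a_k = \frac{2}{T}\int_0^T \frac{2A}{T}\left(\frac{T}{2} - t\right)\cos\frac{2k\pi t}{T}\,dt = 0, \quad k = 1, 2, \ldots$$

$$b_k = \frac{2}{T}\int_0^T \frac{2A}{T}\left(\frac{T}{2} - t\right)\sin\frac{2k\pi t}{T}\,dt = \frac{2A}{k\pi}, \quad k = 1, 2, \ldots$$

We can now investigate how the sound wave $p(t)$ is perceived by the human ear. We note that $4/T = 20{,}000$ cps, so we need only
go up to $k = 4$ in the formulas above. The least squares approximation to $p(t)$ is then

$$q(t) = \frac{2A}{\pi}\left[\sin\frac{2\pi}{T}t + \frac{1}{2}\sin\frac{4\pi}{T}t + \frac{1}{3}\sin\frac{6\pi}{T}t + \frac{1}{4}\sin\frac{8\pi}{T}t\right]$$

The four sinusoidal terms have frequencies of 5000, 10,000, 15,000, and 20,000 cps, respectively. In Figure 11.20.8 we have
plotted $p(t)$ and $q(t)$ over one period. Although $q(t)$ is not a very good point-by-point approximation to $p(t)$, to the ear, both $p(t)$
and $q(t)$ produce the same sensation of sound.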
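
The coefficients in this example can also be estimated numerically. The sketch below approximates the integrals of Theorem 11.20.2 with a midpoint rule for the saw-tooth wave $p(t) = (2A/T)(T/2 - t)$ and compares $b_k$ with the closed form $2A/(k\pi)$. The amplitude $A = 1$ and the number of sample points are assumptions made only for the illustration.

```python
import numpy as np

A = 1.0            # amplitude in arbitrary units (an assumption)
T = 1 / 5000       # period in seconds, from the example
N = 200_000        # number of midpoint-rule sample points (an assumption)

dt = T / N
t = (np.arange(N) + 0.5) * dt             # midpoints of N equal subintervals of [0, T]
p = (2 * A / T) * (T / 2 - t)             # saw-tooth wave on [0, T]

def coeffs(k):
    """Midpoint-rule estimates of a_k and b_k from Theorem 11.20.2."""
    a_k = (2 / T) * np.sum(p * np.cos(2 * k * np.pi * t / T)) * dt
    b_k = (2 / T) * np.sum(p * np.sin(2 * k * np.pi * t / T)) * dt
    return a_k, b_k

for k in range(1, 5):
    a_k, b_k = coeffs(k)
    # columns: k, estimated a_k, estimated b_k, closed form 2A/(k*pi)
    print(k, round(a_k, 6), round(b_k, 6), round(2 * A / (k * np.pi), 6))
```

Only $k = 1, \ldots, 4$ are computed because, as noted above, higher multiples of the basic frequency lie beyond 20,000 cps.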
Figure 11.20.7




As discussed in Section 9.4, the least squares approximation becomes better as the number of terms in the approximating
trigonometric polynomial becomes larger. More precisely,

$$\int_0^{2\pi}\left[f(t) - \frac{1}{2}a_0 - \sum_{k=1}^{n}(a_k\cos kt + b_k\sin kt)\right]^2 dt$$

tends to zero as $n$ approaches infinity. We denote this by writing

$$f(t) = \frac{1}{2}a_0 + \sum_{k=1}^{\infty}(a_k\cos kt + b_k\sin kt)$$

where the right side of this equation is the Fourier series of $f(t)$. Whether the Fourier series of $f(t)$ converges to $f(t)$ for each $t$
is another question, and a more difficult one. For most continuous functions encountered in applications, the Fourier series does
indeed converge to its corresponding function for each value of $t$.
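
The decrease of the mean square error can be observed numerically. As a sketch, take the saw-tooth function $f(t) = \pi - t$ on $[0, 2\pi]$, for which Theorem 11.20.1 gives $a_k = 0$ and $b_k = 2/k$ (the case $T = 2\pi$, $A = \pi$ of Example 1). The choice of function, the orders tested, and the midpoint-rule grid are assumptions made for the illustration.

```python
import numpy as np

N = 100_000                                # midpoint-rule sample points (an assumption)
dt = 2 * np.pi / N
t = (np.arange(N) + 0.5) * dt
f = np.pi - t                              # saw-tooth on [0, 2*pi]: a_k = 0, b_k = 2/k

def mean_square_error(n):
    """Integral of [f(t) - g_n(t)]^2 over [0, 2*pi] for the order-n approximation."""
    g = sum((2 / k) * np.sin(k * t) for k in range(1, n + 1))
    return float(np.sum((f - g) ** 2) * dt)

errors = [mean_square_error(n) for n in (1, 4, 16, 64)]
print(errors)                              # the values shrink toward zero as n grows
```

For this function the error has the closed form $2\pi^3/3 - 4\pi\sum_{k=1}^{n} 1/k^2$, which tends to zero because $\sum 1/k^2 = \pi^2/6$.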



Exercise Set 11.20






1. Find the trigonometric polynomial of order 3 that is the least squares approximation to the function $f(t) = (t - \pi)^2$ over the
interval $[0, 2\pi]$.



2. Find the trigonometric polynomial of order 4 that is the least squares approximation to the function $f(t) = t^2$ over the interval
$[0, T]$.



3. Find the trigonometric polynomial of order 4 that is the least squares approximation to the function $f(t)$ over the interval
$[0, 2\pi]$, where

$$f(t) = \begin{cases} \sin t, & 0 \le t \le \pi \\ 0, & \pi < t \le 2\pi \end{cases}$$

4. Find the trigonometric polynomial of arbitrary order $n$ that is the least squares approximation to the function $f(t) = \sin\frac{1}{2}t$ over
the interval $[0, 2\pi]$.

5. Find the trigonometric polynomial of arbitrary order $n$ that is the least squares approximation to the function $f(t)$ over the
interval $[0, T]$, where

$$f(t) = \begin{cases} t, & 0 \le t \le \frac{1}{2}T \\ T - t, & \frac{1}{2}T < t \le T \end{cases}$$

6. For the inner product

$$\langle u, v \rangle = \int_0^{2\pi} u(t)v(t)\,dt$$

show that

(a) $\|1\| = \sqrt{2\pi}$

(b) $\|\cos kt\| = \sqrt{\pi}$ for $k = 1, 2, \ldots$

(c) $\|\sin kt\| = \sqrt{\pi}$ for $k = 1, 2, \ldots$

7. Show that the $2n + 1$ functions

$$1, \cos t, \cos 2t, \ldots, \cos nt, \sin t, \sin 2t, \ldots, \sin nt$$

are orthogonal over the interval $[0, 2\pi]$ relative to the inner product $\langle u, v \rangle$ defined in Exercise 6.

8. If $f(t)$ is defined and continuous on the interval $[0, T]$, show that $f(T\tau/2\pi)$ is defined and continuous for $\tau$ in the interval
$[0, 2\pi]$. Use this fact to show how Theorem 11.20.2 follows from Theorem 11.20.1.

Section 11.20

Technology Exercises



The following exercises are designed to be solved using a technology utility. Typically, this will be matlab, Mathematica, Maple, 
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 



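T1. Let $g$ be the function

$$g(t) = \frac{3 + 4\sin t}{5 - 4\cos t}$$

for $0 \le t \le 2\pi$. Use a computer to determine the Fourier coefficients

$$\begin{Bmatrix} a_k \\ b_k \end{Bmatrix} = \frac{1}{\pi}\int_0^{2\pi}\left(\frac{3 + 4\sin t}{5 - 4\cos t}\right)\begin{Bmatrix} \cos kt \\ \sin kt \end{Bmatrix} dt$$

for $k = 0, 1, 2, 3, 4, 5$. From your results, make a conjecture about the general expressions for $a_k$ and $b_k$. Test your conjecture
by calculating

$$\frac{1}{2}a_0 + \sum_{k=1}^{\infty}(a_k\cos kt + b_k\sin kt)$$

on the computer and see whether it converges to $g(t)$.

T2. Let $g$ be the function

$$g(t) = e^{\cos t}[\cos(\sin t) + \sin(\sin t)]$$

for $0 \le t \le 2\pi$. Use a computer to determine the Fourier coefficients

$$\begin{Bmatrix} a_k \\ b_k \end{Bmatrix} = \frac{1}{\pi}\int_0^{2\pi} g(t)\begin{Bmatrix} \cos kt \\ \sin kt \end{Bmatrix} dt$$

for $k = 0, 1, 2, 3, 4, 5$. From your results, make a conjecture about the general expressions for $a_k$ and $b_k$. Test your conjecture
by calculating

$$\frac{1}{2}a_0 + \sum_{k=1}^{\infty}(a_k\cos kt + b_k\sin kt)$$

on the computer and see whether it converges to $g(t)$.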



Copyright © 2005 John Wiley & Sons, Inc. All rights reserved. 



11.21

WARPS AND MORPHS

Among the more interesting image-manipulation techniques available for
computer graphics are warps and morphs. In this section we show how linear
transformations can be used to distort a single picture to produce a warp, or to
distort and blend two pictures to produce a morph.

Prerequisites: Geometry of Linear Operators on $R^2$ (Section 9.2)
Linear Independence
Bases in $R^2$

Most computer graphics software enables you to manipulate an image in various ways, such as by scaling, rotating, or slanting the 
image. Distorting an image by moving the corners of a rectangle containing the image is another basic image-manipulation 
technique. Distorting various pieces of an image in different ways is a more complicated procedure that results in a warp of the 
picture. In addition, warping two different images in complementary ways and blending the warps results in a morph of the two 
pictures (from the Greek root meaning "shape" or "form"). The main application of warping and morphing images has been the 
production of special effects in motion pictures and television or print advertisements. However, many scientific and technological 
applications for such techniques have also arisen — for example, studying the evolution of the shapes of living organisms, 
analyzing the growth and development of living organisms, assisting in reconstructive and cosmetic surgery, investigating 
variations in the design of a product, and "aging" photographs of missing people or police suspects. 

Warps 

We begin by describing a simple warp of a triangular region in the plane. Let the three vertices of a triangle be given by the three
noncollinear points $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ (Figure 11.21.1a). We shall call this triangle the begin-triangle. If $\mathbf{v}$ is any point in the
begin-triangle, then there are unique constants $c_1$ and $c_2$ such that

\[ \mathbf{v} - \mathbf{v}_3 = c_1(\mathbf{v}_1 - \mathbf{v}_3) + c_2(\mathbf{v}_2 - \mathbf{v}_3) \tag{1} \]

Equation 1 expresses the vector $\mathbf{v} - \mathbf{v}_3$ as a (unique) linear combination of the two linearly independent vectors $\mathbf{v}_1 - \mathbf{v}_3$ and
$\mathbf{v}_2 - \mathbf{v}_3$ with respect to an origin at $\mathbf{v}_3$. If we set $c_3 = 1 - c_1 - c_2$, then we can rewrite Equation 1 as

\[ \mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3 \tag{2} \]

where

\[ c_1 + c_2 + c_3 = 1 \tag{3} \]

from the definition of $c_3$. We say that $\mathbf{v}$ is a convex combination of the vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ if Equations 2 and 3 are satisfied and, in
addition, the coefficients $c_1$, $c_2$, and $c_3$ are nonnegative. It can be shown (Exercise 6) that $\mathbf{v}$ lies in the triangle determined by $\mathbf{v}_1$, $\mathbf{v}_2$,
and $\mathbf{v}_3$ if and only if it is a convex combination of those three vectors.




Figure 11.21.1 
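
The test for membership in the begin-triangle can be sketched in a few lines of Python. This is a minimal illustration rather than anything prescribed by the text: the function names `convex_coefficients` and `in_triangle`, the tolerance `tol`, and the sample triangle are assumptions made for the example. The coefficients are obtained by solving Equation 1 with Cramer's rule and then using Equation 3 for the third coefficient.

```python
def convex_coefficients(v, v1, v2, v3):
    """Solve Equations 1 and 3 for (c1, c2, c3); points are (x, y) tuples."""
    # Edge vectors and the target point, all measured from an origin at v3.
    ax, ay = v1[0] - v3[0], v1[1] - v3[1]
    bx, by = v2[0] - v3[0], v2[1] - v3[1]
    px, py = v[0] - v3[0], v[1] - v3[1]
    det = ax * by - bx * ay  # zero exactly when v1, v2, v3 are collinear
    if det == 0:
        raise ValueError("v1, v2, v3 are collinear")
    # Cramer's rule for c1*(v1 - v3) + c2*(v2 - v3) = v - v3.
    c1 = (px * by - bx * py) / det
    c2 = (ax * py - px * ay) / det
    return c1, c2, 1.0 - c1 - c2


def in_triangle(v, v1, v2, v3, tol=1e-12):
    """True when v is a convex combination of v1, v2, v3 (up to rounding)."""
    return all(c >= -tol for c in convex_coefficients(v, v1, v2, v3))


# Illustrative triangle (assumed values): the first point is inside, the second is not.
print(convex_coefficients((3, 3), (1, 1), (3, 5), (4, 2)))
print(in_triangle((0, 0), (1, 1), (3, 5), (4, 2)))
```

The small tolerance only guards against rounding error for points on an edge; exact arithmetic would compare against zero.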



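Next, given three noncollinear points $\mathbf{w}_1$, $\mathbf{w}_2$, and $\mathbf{w}_3$ of an end-triangle (Figure 11.21.1b), there is a unique affine transformation
that maps $\mathbf{v}_1$ to $\mathbf{w}_1$, $\mathbf{v}_2$ to $\mathbf{w}_2$, and $\mathbf{v}_3$ to $\mathbf{w}_3$. That is, there is a unique $2 \times 2$ invertible matrix $M$ and a unique vector $\mathbf{b}$ such that

\[ \mathbf{w}_i = M\mathbf{v}_i + \mathbf{b} \quad \text{for } i = 1, 2, 3 \tag{4} \]

(See Exercise 5 for the evaluation of $M$ and $\mathbf{b}$.) Moreover, it can be shown (Exercise 3) that the image $\mathbf{w}$ of the vector $\mathbf{v}$ in Equation 2 under
this affine transformation is

\[ \mathbf{w} = c_1\mathbf{w}_1 + c_2\mathbf{w}_2 + c_3\mathbf{w}_3 \tag{5} \]
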



This is a basic property of affine transformations: They map a convex combination of vectors to the same convex combination of 
the images of the vectors. 
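
As a hedged illustration of Equations 4 and 5, the following Python sketch recovers the matrix and vector of the affine transformation from the two triangles and then checks the convex-combination property numerically. The function names `affine_from_triangles` and `apply_affine` and the sample vertices are assumptions for this example. Instead of the six-equation system mentioned in Exercise 5, the sketch uses the equivalent observation that the matrix must carry the edge vectors of the begin-triangle to those of the end-triangle.

```python
def affine_from_triangles(v, w):
    """Return (M, b) with w[i] = M v[i] + b for i = 0, 1, 2 (points are tuples)."""
    (x1, y1), (x2, y2), (x3, y3) = v
    # Columns of A are the begin-triangle edge vectors v1 - v3 and v2 - v3.
    a11, a12 = x1 - x3, x2 - x3
    a21, a22 = y1 - y3, y2 - y3
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("begin-triangle vertices are collinear")
    # Columns of B are the end-triangle edge vectors w1 - w3 and w2 - w3.
    b11, b12 = w[0][0] - w[2][0], w[1][0] - w[2][0]
    b21, b22 = w[0][1] - w[2][1], w[1][1] - w[2][1]
    # M A = B, so M = B A^{-1}, where A^{-1} = (1/det) [[a22, -a12], [-a21, a11]].
    M = [[(b11 * a22 - b12 * a21) / det, (-b11 * a12 + b12 * a11) / det],
         [(b21 * a22 - b22 * a21) / det, (-b21 * a12 + b22 * a11) / det]]
    # The translation follows from w3 = M v3 + b.
    b = (w[2][0] - M[0][0] * x3 - M[0][1] * y3,
         w[2][1] - M[1][0] * x3 - M[1][1] * y3)
    return M, b


def apply_affine(M, b, p):
    """Evaluate M p + b for a 2D point p."""
    return (M[0][0] * p[0] + M[0][1] * p[1] + b[0],
            M[1][0] * p[0] + M[1][1] * p[1] + b[1])


# Illustrative triangles (assumed values).
begin = [(1, 1), (2, 3), (2, 1)]
end = [(4, 3), (9, 5), (5, 3)]
M, b = affine_from_triangles(begin, end)

# Equation 5: the image of a convex combination is the same combination of images.
c = (0.2, 0.3, 0.5)
v = tuple(sum(ci * p[k] for ci, p in zip(c, begin)) for k in range(2))
w = tuple(sum(ci * p[k] for ci, p in zip(c, end)) for k in range(2))
print(apply_affine(M, b, v), w)
```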

Now suppose that the begin-triangle contains a picture within it (Figure 11.21.2a). That is, to each point in the begin-triangle we
assign a gray level, say 0 for white and 100 for black, with any other gray level lying between 0 and 100. In particular, let a
scalar-valued function $\rho_0$, called the picture-density of the begin-triangle, be defined so that $\rho_0(\mathbf{v})$ is the gray level at the point $\mathbf{v}$ in
the begin-triangle. We can now define a picture in the end-triangle, called a warp of the original picture, with a picture-density $\rho_1$
by defining the gray level at the point $\mathbf{w}$ within the end-triangle to be the gray level of the point $\mathbf{v}$ in the begin-triangle that maps
onto $\mathbf{w}$. In equation form, the picture-density $\rho_1$ is determined by

\[ \rho_1(\mathbf{w}) = \rho_0(c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3) \tag{6} \]

In this way, as $c_1$, $c_2$, and $c_3$ vary over all nonnegative values that add to one, Equation 5 generates all points $\mathbf{w}$ in the end-triangle, and Equation 6
generates the gray levels $\rho_1(\mathbf{w})$ of the warped picture at those points (Figure 11.21.2b).









Figure 11.21.2 
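
In practice a warp of this kind is often evaluated point by point in the end-triangle. The Python sketch below is one way to do that under stated assumptions: the names `barycentric` and `warp_density`, the choice to return `None` outside the end-triangle, and the sample picture-density are all illustrative and not part of the text. For a point in the end-triangle it solves Equation 5 for the three coefficients, forms the matching point in the begin-triangle, and reads off the gray level as in Equation 6.

```python
def barycentric(p, tri):
    """Coefficients (c1, c2, c3) of point p relative to triangle tri = (p1, p2, p3)."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (x1 - x3) * (y2 - y3) - (x2 - x3) * (y1 - y3)
    c1 = ((p[0] - x3) * (y2 - y3) - (x2 - x3) * (p[1] - y3)) / det
    c2 = ((x1 - x3) * (p[1] - y3) - (p[0] - x3) * (y1 - y3)) / det
    return c1, c2, 1.0 - c1 - c2


def warp_density(rho0, begin, end):
    """Return the warped picture-density on the end-triangle (Equation 6)."""
    def rho1(w):
        c = barycentric(w, end)  # Equation 5 solved for the coefficients
        if min(c) < -1e-12:
            return None  # w lies outside the end-triangle
        # Matching point in the begin-triangle: same coefficients, begin vertices.
        v = (sum(ci * q[0] for ci, q in zip(c, begin)),
             sum(ci * q[1] for ci, q in zip(c, begin)))
        return rho0(v)
    return rho1


# Assumed sample picture: gray level grows linearly with x in the begin-triangle.
begin = [(0, 0), (1, 0), (0, 1)]
end = [(0, 0), (2, 0), (0, 1)]
rho1 = warp_density(lambda v: 100 * v[0], begin, end)
print(rho1((1, 0.25)), rho1((3, 3)))
```

Working backward from the end-triangle ensures every point of the warped picture receives exactly one gray level.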

Equation 6 determines a very simple warp of a picture within a single triangle. More generally, we can break up a picture into
many triangular regions and warp each triangular region differently. This gives us much freedom in designing a warp through our
choice of triangular regions and how we change them. To this end, suppose we are given a picture contained within some
rectangular region of the plane. We choose $n$ points $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ within the rectangle, which we call vertex points, so that they
fall on key elements or features of the picture we wish to warp (Figure 11.21.3a). Once the vertex points are chosen, we complete
a triangulation of the rectangular region; that is, we draw lines between the vertex points in such a way that we have the following
conditions (Figure 11.21.3b):

1. The lines form the sides of a set of triangles.

2. The lines do not intersect.

3. Each vertex point is the vertex of at least one triangle.

4. The union of the triangles is the rectangle.

5. The set of triangles is maximal (that is, no more vertices can be connected).

Note that condition 4 requires that each corner of the rectangle containing the picture be a vertex point.




Figure 11.21.3 

One can always form a triangulation from any $n$ vertex points, but the triangulation is not necessarily unique. For example, Figures
11.21.3b and 11.21.3c are two different triangulations of the set of vertex points in Figure 11.21.3a. Since there are various
computer algorithms that perform triangulations very quickly, it is not necessary to perform the tiresome triangulation task by
hand; one need only specify the desired vertex points and let a computer generate a triangulation from them. If $n$ is the number of
vertex points chosen, it can be shown that the number of triangles $m$ of any triangulation of those points is given by

\[ m = 2n - 2 - k \tag{7} \]

where $k$ is the number of vertex points lying on the boundary of the rectangle, including the four situated at the corner points.



The warp is specified by moving the $n$ vertex points $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ to new locations $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n$ according to the changes we
desire in the picture (Figures 11.21.4a and 11.21.4b). However, we impose two restrictions on the movements of the vertex points:

1. The four vertex points at the corners of the rectangle are to remain fixed, and any vertex point on a side of the rectangle is to
remain fixed or move to another point on the same side of the rectangle. All other vertex points are to remain in the interior
of the rectangle.

2. The triangles determined by the triangulation are not to overlap after their vertices have been moved.

The first restriction guarantees that the rectangular shape of the begin-picture is preserved. The second restriction guarantees that
the displaced vertex points still form a triangulation of the rectangle and that the new triangulation is similar to the original one.
For example, Figure 11.21.4c is not an allowable movement of the vertex points shown in Figure 11.21.4a. Although a violation of
this condition can be handled mathematically without too much additional effort, the resulting warps usually produce unnatural
results and we shall not consider them here.




Figure 11.21.5 is a warp of a photograph of a woman using a triangulation with 94 vertex points and 179 triangles. Note that the
vertex points in the begin-triangulation are chosen to lie along key features of the picture (hairline, eyes, lips, etc.). These vertex 
points were moved to final positions corresponding to those same features in a picture of the woman taken 20 years after the 
begin-picture. Thus, the warped picture represents the woman forced into her older shape but using her younger gray levels. 
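
As a quick consistency check, and assuming Equation 7 applies to this triangulation, the reported counts $n = 94$ and $m = 179$ determine how many of the vertex points lie on the boundary of the rectangle:

\[ k = 2n - 2 - m = 2(94) - 2 - 179 = 7 \]

Under that assumption, three vertex points besides the four corners would lie on the sides of the rectangle. This value is inferred from the formula, not stated in the text.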

Figure 11.21.5

Time-Varying Warps

A time-varying warp is the set of warps generated when the vertex points of the begin-picture are moved continually in time from
their original positions to specified final positions. This gives us a motion picture in which the begin-picture is continually warped
to a final warp. Let us choose time units so that $t = 0$ corresponds to our begin-picture and $t = 1$ corresponds to our final warp. The
simplest way of moving the vertex points from time 0 to time 1 is with constant velocity along straight-line paths from their initial
positions to their final positions.

To describe such a motion, let $\mathbf{u}_i(t)$ denote the position of the $i$th vertex point at any time $t$ between 0 and 1. Thus $\mathbf{u}_i(0) = \mathbf{v}_i$ (its
given position in the begin-picture) and $\mathbf{u}_i(1) = \mathbf{w}_i$ (its given position in the final warp). In between, we determine its position by

\[ \mathbf{u}_i(t) = (1 - t)\mathbf{v}_i + t\mathbf{w}_i \tag{8} \]

Note that Equation 8 expresses $\mathbf{u}_i(t)$ as a convex combination of $\mathbf{v}_i$ and $\mathbf{w}_i$ for each $t$ in [0, 1]. Figure 11.21.6 illustrates a time-varying
triangulation of a plain rectangular region with six vertex points. The lines connecting the vertex points at the different times are 
the space-time paths of these vertex points in this space-time diagram. 
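
The constant-velocity motion of Equation 8 amounts to a linear interpolation of each vertex point. A minimal Python sketch follows; the function name `vertex_positions` and the sample points are assumptions for illustration.

```python
def vertex_positions(begin, end, t):
    """Equation 8: u_i(t) = (1 - t) v_i + t w_i for every vertex point i."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return [((1 - t) * vx + t * wx, (1 - t) * vy + t * wy)
            for (vx, vy), (wx, wy) in zip(begin, end)]


# Two assumed vertex points, one quarter of the way to their final positions.
print(vertex_positions([(0, 0), (2, 4)], [(4, 0), (2, 0)], 0.25))
```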




Figure 11.21.6



Once the positions of the vertex points are computed at time $t$, a warp is performed between the begin-picture and the triangulation
at time $t$ determined by the displaced vertex points at that time. Figure 11.21.7 shows a time-varying warp at five values of $t$
generated from the warp between $t = 0$ and $t = 1$ shown in Figure 11.21.5.

Figure 11.21.7

Morphs

A time-varying morph can be described as a blending of two time-varying warps of two different pictures using two triangulations
that match corresponding features in the two pictures. One of the two pictures is designated as the begin-picture and the other as
the end-picture. First, a time-varying warp from $t = 0$ to $t = 1$ is generated in which the begin-picture is warped into the shape of
the end-picture. Then a time-varying warp from $t = 1$ to $t = 0$ is generated in which the end-picture is warped into the shape of the
begin-picture. Finally, a weighted average of the gray levels of the two warps at each time $t$ is produced to generate the morph of
the two images at time $t$.

Figure 11.21.8 shows two photographs of a woman taken 20 years apart. Below the pictures are two corresponding triangulations
in which corresponding features of the two photographs are matched. The time-varying morph between these two pictures for five
values of $t$ between 0 and 1 is shown in Figure 11.21.9.

Figure 11.21.8

The procedure for producing such a morph is outlined in the following nine steps (Figure 11.21.10):

Step 1. Given a begin-picture with picture-density $\rho_0$ and an end-picture with picture-density $\rho_1$, position $n$ vertex points
$\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ in the begin-picture at key features of that picture.

Step 2. Position $n$ corresponding vertex points $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n$ in the end-picture at the corresponding key features of that
picture.

Step 3. Triangulate the begin- and end-pictures in similar ways by drawing lines between corresponding vertex points in both
pictures.

Step 4. For any time $t$ between 0 and 1, find the vertex points $\mathbf{u}_1(t), \mathbf{u}_2(t), \ldots, \mathbf{u}_n(t)$ in the morph picture at that time, using
the formula

\[ \mathbf{u}_i(t) = (1 - t)\mathbf{v}_i + t\mathbf{w}_i, \quad i = 1, 2, \ldots, n \tag{9} \]

Step 5. Triangulate the morph picture at time $t$ similar to the begin- and end-picture triangulations.

Step 6. For any point $\mathbf{u}$ in the morph picture at time $t$, find the triangle in the triangulation of the morph picture in which it lies
and the vertices $\mathbf{u}_I(t)$, $\mathbf{u}_J(t)$, and $\mathbf{u}_K(t)$ of that triangle. (See Exercise 1 to determine whether a given point lies in a given
triangle.)

Step 7. Express $\mathbf{u}$ as a convex combination of $\mathbf{u}_I(t)$, $\mathbf{u}_J(t)$, and $\mathbf{u}_K(t)$ by finding the constants $c_I$, $c_J$, and $c_K$ such that

\[ \mathbf{u} = c_I\mathbf{u}_I(t) + c_J\mathbf{u}_J(t) + c_K\mathbf{u}_K(t) \tag{10} \]

and

\[ c_I + c_J + c_K = 1 \tag{11} \]

Step 8. Determine the locations of the point $\mathbf{u}$ in the begin- and end-pictures using

\[ \mathbf{v} = c_I\mathbf{v}_I + c_J\mathbf{v}_J + c_K\mathbf{v}_K \quad \text{(in the begin-picture)} \tag{12} \]

and

\[ \mathbf{w} = c_I\mathbf{w}_I + c_J\mathbf{w}_J + c_K\mathbf{w}_K \quad \text{(in the end-picture)} \tag{13} \]

Step 9. Finally, determine the picture-density $\rho_t(\mathbf{u})$ of the morph-picture at the point $\mathbf{u}$ using

\[ \rho_t(\mathbf{u}) = (1 - t)\rho_0(\mathbf{v}) + t\rho_1(\mathbf{w}) \tag{14} \]

Figure 11.21.10



Step 9 is the key step in distinguishing a warp from a morph. Equation 14 takes weighted averages of the gray levels of the begin- 
and end-pictures to produce the gray levels of the morph-picture. The weights depend on the fraction of the distances that the 
vertex points have moved from their beginning positions to their ending positions. For example, if the vertex points have moved 
one-fourth of the way to their destinations (that is, if $t = 0.25$), then we use one-fourth of the gray levels of the end-picture and
three-fourths of the gray levels of the begin-picture. Thus, as time progresses, not only does the shape of the begin-picture 
gradually change into the shape of the end-picture (as in a warp) but the gray levels of the begin-picture also gradually change into 
the gray levels of the end-picture. 
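
For a single pair of matched triangles, Steps 4 and 7 through 9 can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure's definitive implementation: the names `barycentric`, `combine`, and `morph_density`, the sample triangles, and the constant picture-densities are invented for the example, and Step 6 (locating the triangle that contains the point) is assumed to have been done already.

```python
def barycentric(p, tri):
    """Coefficients (c1, c2, c3) of point p relative to triangle tri = (p1, p2, p3)."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (x1 - x3) * (y2 - y3) - (x2 - x3) * (y1 - y3)
    c1 = ((p[0] - x3) * (y2 - y3) - (x2 - x3) * (p[1] - y3)) / det
    c2 = ((x1 - x3) * (p[1] - y3) - (p[0] - x3) * (y1 - y3)) / det
    return c1, c2, 1.0 - c1 - c2


def combine(c, tri):
    """Convex combination of the triangle's vertices with coefficients c."""
    return (sum(ci * q[0] for ci, q in zip(c, tri)),
            sum(ci * q[1] for ci, q in zip(c, tri)))


def morph_density(u, t, begin, end, rho0, rho1):
    """Gray level of the morph picture at point u and time t (one triangle pair)."""
    # Step 4 (Equation 9): vertices of the morph triangle at time t.
    mid = [((1 - t) * v[0] + t * w[0], (1 - t) * v[1] + t * w[1])
           for v, w in zip(begin, end)]
    # Step 7 (Equations 10 and 11): coefficients of u in the morph triangle.
    c = barycentric(u, mid)
    if min(c) < -1e-12:
        raise ValueError("u lies outside this morph triangle")
    # Step 8 (Equations 12 and 13): matching points in the begin- and end-pictures.
    v = combine(c, begin)
    w = combine(c, end)
    # Step 9 (Equation 14): weighted average of the two gray levels.
    return (1 - t) * rho0(v) + t * rho1(w)


# Assumed data: an all-white begin-picture (0) and an all-black end-picture (100).
begin = [(0, 0), (1, 0), (0, 1)]
end = [(0, 0), (2, 0), (0, 1)]
print(morph_density((0.5, 0.25), 0.25, begin, end, lambda v: 0.0, lambda w: 100.0))
```

With these constant densities the result at one-fourth of the way is one-fourth of the end-picture's gray level, in line with the weighting described above.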

The procedure described above to generate a morph is cumbersome to perform by hand, but it is the kind of dull, repetitive 
procedure at which computers excel. A successful morph demands good preparation and requires more artistic ability than 
mathematical ability. (The software designer is required to have the mathematical ability.) The two photographs to be morphed 
should be carefully chosen so that they have matching features, and the vertex points in the two photographs also should be 
carefully chosen so that the triangles in the two resulting triangulations contain similar features of the two pictures. When the 
procedure is done correctly, each frame of the morph should look just as "real" as the begin- and end-pictures. 

The techniques we have discussed in this section can be generalized in numerous ways to produce much more elaborate warps and 
morphs. For example: 

1. If the pictures are in color, the three components of the picture colors (red, green, and blue) can be morphed separately to 
produce a color morph. 

2. Rather than following straight-line paths to their destinations, the vertices of a triangulation can be directed separately along 
more complicated paths to produce a variety of results. 

3. Rather than travel with constant speeds along their paths, the vertices of a triangulation can be directed to have different 
speeds at different times. For example, in a morph between two faces, the hairline can be made to change first, then the 
nose, and so forth. 

4. Similarly, the gray-level mixing of the begin-picture and end-picture at different times and different vertices can be varied in 
a more complicated way than that in Equation 14. 

5. One can morph two surfaces in three-dimensional space (representing two complete heads, for example) by triangulating the 
surfaces and using the techniques in this section. 

6. One can morph two solids in three-dimensional space (for example, two three-dimensional tomographs of a beating human 
heart at two different times) by dividing the two solids into corresponding tetrahedral regions. 

7. Two film strips can be morphed frame by frame by different amounts between each pair of frames to produce a morphed 
film strip in which, say, an actor walking along a set is gradually morphed into an ape walking along the set. 



8. Instead of using straight lines to triangulate two pictures to be morphed, more complicated curves, such as spline curves, can 
be matched between the two pictures. 

9. Three or more pictures can be morphed together by generalizing the formulas given in this section. 

These and other generalizations have made warping and morphing two of the most active areas in computer graphics. 
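
Generalizations 1 and 4 can be illustrated together with a short Python sketch. It is only one possible reading: the function names `ease` and `blend_rgb` are hypothetical, and the smoothstep timing curve is an assumed example of a mixing schedule that differs from the linear weights of Equation 14.

```python
def ease(t):
    """Assumed timing curve (smoothstep) with ease(0) = 0 and ease(1) = 1."""
    return t * t * (3 - 2 * t)


def blend_rgb(rgb0, rgb1, t, timing=ease):
    """Blend the red, green, and blue components separately at time t."""
    s = timing(t)  # the weight given to the end-picture
    return tuple((1 - s) * a + s * b for a, b in zip(rgb0, rgb1))


# Assumed colors on a 0 to 100 scale, halfway through the morph.
print(blend_rgb((0, 0, 0), (100, 50, 0), 0.5))
```

Passing `timing=lambda t: t` recovers the linear weighting of Equation 14 applied to each color component.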



Exercise Set 11.21



1. Determine whether the vector $\mathbf{v}$ is a convex combination of the vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$. Do this by solving Equations 1 and 3 for
$c_1$, $c_2$, and $c_3$ and ascertaining whether these coefficients are nonnegative.



(a) 



v = 



(b) 



v = 



vi = 


"f 
_1_ 


»V2 = 


"3" 
5 


, v 3 = 


"4" 


vi = 


"f 
_1_ 


>V2 = 


"3" 
5_ 


, v 3 = 


"4" 
_2_ 



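(c) $\mathbf{v} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$, $\mathbf{v}_1 = \begin{bmatrix} 3 \\ 3 \end{bmatrix}$, $\mathbf{v}_2 = \begin{bmatrix} -2 \\ -2 \end{bmatrix}$, $\mathbf{v}_3 = \begin{bmatrix} 3 \\ 0 \end{bmatrix}$

(d) $\mathbf{v} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $\mathbf{v}_1 = \begin{bmatrix} 3 \\ 3 \end{bmatrix}$, $\mathbf{v}_2 = \begin{bmatrix} -2 \\ -2 \end{bmatrix}$, $\mathbf{v}_3 = \begin{bmatrix} 3 \\ 0 \end{bmatrix}$
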

2. Verify Equation 7 for the two triangulations given in Figure 11.21.3.



3. Let an affine transformation be given by a $2 \times 2$ matrix $M$ and a two-dimensional vector $\mathbf{b}$. Let $\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$, where
$c_1 + c_2 + c_3 = 1$; let $\mathbf{w} = M\mathbf{v} + \mathbf{b}$; and let $\mathbf{w}_i = M\mathbf{v}_i + \mathbf{b}$ for $i = 1, 2, 3$. Show that $\mathbf{w} = c_1\mathbf{w}_1 + c_2\mathbf{w}_2 + c_3\mathbf{w}_3$. (This shows that
an affine transformation maps a convex combination of vectors to the same convex combination of the images of the vectors.)



(a) Exhibit a triangulation of the points in Figure 1 1.21.3 in which the points ¥3, ¥5, and v^ form the vertices of a single 
triangle. 

(b) Exhibit a triangulation of the points in Figure 1 1.21.3 in which the points V2, vj, and ¥7 do n<9£ form the vertices of a 
single triangle. 



5. Find the $2 \times 2$ matrix $M$ and two-dimensional vector $\mathbf{b}$ that define the affine transformation that maps the three vectors $\mathbf{v}_1$, $\mathbf{v}_2$,
and $\mathbf{v}_3$ to the three vectors $\mathbf{w}_1$, $\mathbf{w}_2$, and $\mathbf{w}_3$. Do this by setting up a system of six linear equations for the four entries of the
matrix $M$ and the two entries of the vector $\mathbf{b}$.



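(a) $\mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$, $\mathbf{v}_2 = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$, $\mathbf{v}_3 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$, $\mathbf{w}_1 = \begin{bmatrix} 4 \\ 3 \end{bmatrix}$, $\mathbf{w}_2 = \begin{bmatrix} 9 \\ 5 \end{bmatrix}$, $\mathbf{w}_3 = \begin{bmatrix} 5 \\ 3 \end{bmatrix}$

(b)
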



vi = 


~ 


2" 


'▼2 


= 


"0" 
_0_ 


'▼3 




"2" 
1_ 


,wi 


= 


"-8" 

1 


, «r 2 = 




"0" 
_1_ 


, wr 3 = 




"5" 
_4_ 



(c) 



vi = 



"-2" 


' v 2 = 


"3" 


' v 3 = 


"l" 


, K-J = 





' w"2 = 


"5" 


. W3 = 


3 


[ 1 




w 




|_l)J 




[-2J 




[2J 




[-3J 



(d) 



vi = 



"0" 

_2_ 


>V2 = 


"2" 
_2_ 


' V3 = 


"-4" 
-2 


, wi = 


5 
2 


, W2 = 


"7" 
2 
3_ 


, w 3 = 


7" 
2 
_ -9_ 



6. (a) Let $\mathbf{a}$ and $\mathbf{b}$ be linearly independent vectors in the plane. Show that if $c_1$ and $c_2$ are nonnegative numbers such that
$c_1 + c_2 = 1$, then the vector $c_1\mathbf{a} + c_2\mathbf{b}$ lies on the line segment connecting the tips of the vectors $\mathbf{a}$ and $\mathbf{b}$.

(b) Let $\mathbf{a}$ and $\mathbf{b}$ be linearly independent vectors in the plane. Show that if $c_1$ and $c_2$ are nonnegative numbers such that
$c_1 + c_2 \le 1$, then the vector $c_1\mathbf{a} + c_2\mathbf{b}$ lies in the triangle connecting the origin and the tips of the vectors $\mathbf{a}$ and $\mathbf{b}$.

Hint: First examine the vector $c_1\mathbf{a} + c_2\mathbf{b}$ multiplied by the scale factor $1/(c_1 + c_2)$.

(c) Let $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ be noncollinear points in the plane. Show that if $c_1$, $c_2$, and $c_3$ are nonnegative numbers such that
$c_1 + c_2 + c_3 = 1$, then the vector $c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$ lies in the triangle connecting the tips of the three vectors.

Hint: Let $\mathbf{a} = \mathbf{v}_1 - \mathbf{v}_3$ and $\mathbf{b} = \mathbf{v}_2 - \mathbf{v}_3$, and then use Equation 1 and part (b) of this exercise.



7. (a) What can you say about the coefficients $c_1$, $c_2$, and $c_3$ that determine a convex combination $\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$ if $\mathbf{v}$
lies on one of the three vertices of the triangle determined by the three vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$?

(b) What can you say about the coefficients $c_1$, $c_2$, and $c_3$ that determine a convex combination $\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$ if $\mathbf{v}$
lies on one of the three sides of the triangle determined by the three vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$?

(c) What can you say about the coefficients $c_1$, $c_2$, and $c_3$ that determine a convex combination $\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$ if $\mathbf{v}$
lies in the interior of the triangle determined by the three vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$?



8. (a) The centroid of a triangle lies on the line segment connecting any one of the three vertices of the triangle with the
midpoint of the opposite side. Its location on this line segment is two-thirds of the distance from the vertex. If the three
vertices are given by the vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$, write the centroid as a convex combination of these three vectors.



(b) 



Use your result in part (a) to find the vector defining the centroid of the triangle with the three vertices 

1 
1 



, and 



Section 11.21 Technology Exercises

The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple,
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 



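T1. To warp or morph a surface in $R^3$ we must be able to triangulate the surface. Let

\[ \mathbf{v}_1 = \begin{bmatrix} v_{11} \\ v_{12} \\ v_{13} \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} v_{21} \\ v_{22} \\ v_{23} \end{bmatrix}, \quad \text{and} \quad \mathbf{v}_3 = \begin{bmatrix} v_{31} \\ v_{32} \\ v_{33} \end{bmatrix} \]

be three noncollinear vectors on the surface. Then a vector $\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$ lies in the triangle formed by these three vectors if and only
if $\mathbf{v}$ is a convex combination of the three vectors; that is, $\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$ for some nonnegative coefficients $c_1$, $c_2$, and
$c_3$ whose sum is 1.

(a) Show that in this case, $c_1$, $c_2$, and $c_3$ are solutions of the following linear system:

\[ \begin{bmatrix} v_{11} & v_{21} & v_{31} \\ v_{12} & v_{22} & v_{32} \\ v_{13} & v_{23} & v_{33} \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ 1 \end{bmatrix} \]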



In parts (b)-(d) determine whether the vector v is a convex combination of the vectors 



vi = 



2 3 

7 , v 2 = 

-5 9 



= , and v 3 = 



2 
2 

-4 



(b) 



v = ■ 



(c) 



V= 4 



10 
9 
9 



(d) 



V= 4 



13 

-7 

50 



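T2. To warp or morph a solid object in $R^3$ we first partition the object into disjoint tetrahedrons. Let

\[ \mathbf{v}_1 = \begin{bmatrix} v_{11} \\ v_{12} \\ v_{13} \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} v_{21} \\ v_{22} \\ v_{23} \end{bmatrix}, \quad \mathbf{v}_3 = \begin{bmatrix} v_{31} \\ v_{32} \\ v_{33} \end{bmatrix}, \quad \text{and} \quad \mathbf{v}_4 = \begin{bmatrix} v_{41} \\ v_{42} \\ v_{43} \end{bmatrix} \]

be four noncoplanar vectors. Then a vector $\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$ lies in the solid tetrahedron formed by these
four vectors if and only if $\mathbf{v}$ is a convex combination of the four vectors; that is, $\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3 + c_4\mathbf{v}_4$ for some
nonnegative coefficients $c_1$, $c_2$, $c_3$, and $c_4$ whose sum is one.

(a) Show that in this case, $c_1$, $c_2$, $c_3$, and $c_4$ are solutions of the following linear system:

\[ \begin{bmatrix} v_{11} & v_{21} & v_{31} & v_{41} \\ v_{12} & v_{22} & v_{32} & v_{42} \\ v_{13} & v_{23} & v_{33} & v_{43} \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{bmatrix} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ 1 \end{bmatrix} \]

In parts (b)-(d) determine whether the vector $\mathbf{v}$ is a convex combination of the vectors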
vi = 
-6

/ v 2 = 
a v 3 = 
3

= 2 , and v 4 = 

-1

(b) 



v = 



(c) 



v = 



(d) 



v = 
Abbreviations

ARTs  algebraic reconstruction techniques
CE  civil engineer
cps  cycles per second
EE  electrical engineer
ME  mechanical engineer
RDA  recommended daily allowance

Answers to Exercises



Exercise Set 1.1 (page 6) 



1. (a), (c), (f) 



(a) X = h^ 

*2 = s X2 = r w=q 

X2 = t X2 = s x =r 

X4 = £ y=s 

z = t 



(a) 



(b) 



(c) 



(d) 



1 
3 
2 

2 1 
4 7 



1 -1 



"12 0- 


1 


1 1" 


3 1 


-1 2 


1 7 


1 


"10 1" 






10 2 






13 







(a) 



(b) 



(c) 



2x 



= 



3xi -4*2 = 

3x\ —2^3= 5 

7xi + *2 I 4x 3 = -3 

— 2*2 I *3 = 7 

7x\ I 2*2 I *3 — 3^4 = 5 



x\ I 2^2 I 4^3 



= 1 



*l = 7 

< d > x 2 = -2 

x 3 = 3 
X4= 4 



6. (a) $x - 2y = 5$

(b) Let $x = t$; then $t - 2y = 5$. Solving for $y$ yields $y = \tfrac{1}{2}t - \tfrac{5}{2}$.



12. (a) The lines have no common point of intersection.

(b) The lines intersect in exactly one point.

(c) The three lines coincide.



Exercise Set 1.2 (page 19) 

1. (a), (b), (c), (d), (h), (i), (j) 



>• 

(a) 


Both 


(b) 


Neither 


(c) 


Both 


(d) 


Row-echelon 


(e) 


Neither 


(f) 


Both 



x\ = — 3> *2 = 0' *3 =7 
(a) 



x\ = It 4 8> %2 = — 3£ 4- 2? ^3 = — £ — 5? ^4 = t 
(b) 



xi = 6s-3t-2,X2 = s > x 3 = -At I 7, * 4 = -5t I 8,x$ = t 
(c) 



Inconsistent 
(d) 



x\ = 3? %2 — 1' *3 — 2 
(a) 



13 14 

(b)*l=-7-7^ 2= 7-7'* 3 = ' 



(c) 



Inconsistent 
(d) 



8. Inconsistent 

(a) 



x\= — 4> *2 = 2> *3 = 7 
(b) 



x\ = 3 + 2^ X2 = t 
(c) 



(d) x = l "f'~f' y = lo ' f^-iV £ ' ? = ^ w = s 



12. (a), (c), (d) 



x\ = 0, 7:2 = 0, 7:3 = 
(a) 



*1 = — £> ^2 = — ^ — s» ^3 =4s» X4 = i 
(b) 



w=^= — ^ y = £, z = 
(c) 



14. Only the trivial solution 

(a) 



u = ls — 5t,v= —6s I 4^, w = 2s, x = 2t 
(b) 

Only the trivial solution 
(c) 



/ 1= -l,/ 2 = 0./ 3 = l,/ 4 = 2 
(a) 



(b) 



Zi = — s — & 2,2 = ff' Z3 = — 1-> Z4 = 0' Z5 = t 



19. 



"1 3" 


and 


"1 0" 


1_ 




_° ! . 



are possible answers. 



20. a = W2,,S = 7r,7 = 



23. 



If A=l,then Xl =* 2 = _-L, * 3 = s 



If A = 2, then Xl= _ i s , X2 = 0, x 3 =s 



24. x= -13/7, y = 91/54, z = -91/8 



25. fl =l,i= - 6, c = 2,^ = 10 



30. Three lines, at least two of which are distinct 

(a) 



Three identical lines 



(b) 



32. 



False 



(a) 



False 



(b) 



False 



(c) 



False 



(d) 
1. Undefined 

(a) 



4x2 



(b) 



Undefined 



(c) 



Undefined 



(d) 



(e) 



(f> 



(g) 



5x5 



5x2 



Undefined 



5x2 



(h) 



2. a = 5,b= -3,c = 4,d = l 



(a) 



7 2 4 
3 5 7 



(b) 



(c) 



(d) 



-5 

4 -1 

-1 -1 

-5 

4 -1 

-1 -1 

Undefined 



(e) 



1 


3~ 


4 


2 


9 





4 




3 


9 


4 


4 



(f) 



-1 

1 



(g) 



(h) 



9 


1 -1" 




-13 


2 -4 







1 -6 




9 


-13 0" 


1 


2 1 


-1 


-4 - 


6 



(a) 



(b) 



12 -3 

-4 5 

4 1 

Undefined 





"42 


10E 


I 75" 


(c) 


12 


-3 21 




36 


78 63 




" 3 


45 9" 


(d) 


11 


-11 17 




7 


17 13 




" 3 


45 9" 


(e) 


11 


-11 17 




7 


17 13 




"21 


17' 




(f) 


17 


35_ 






" 


-2 11" 


(g) 


12 


1 


1 8_ 



(h) 



(i) 



12 
48 
24 

61 



6 


9" 


-20 


14 


8 


16 



(J) 



35 



(k) 



(28) 



(a) 



"67" 




"3" 




' -2 




"7" 


64 


= 6 


6 


+ 


5 


+ 7 


4 


63 









4 




9 



"41" 




"3" 




" -2" 




"7" 


21 


= -2 


6 


+ 1 


5 


+ 7 


4 


67 









4 




9 



"41" 




"3" 




"-2" 




"7" 


59 


= 4 


6 


+ 3 


5 


+ 5 


4 


57 









4 




9 



(b) 



6 




6 




-2 




4 


6 


= 3 





1 6 


1 


1 


3 


63 




7 




7 




5 



"-6" 




"6" 




"-2" 




"4" 


17 


= -2 





+ 5 


1 


1 4 


3 


41 




7 




7 




5 



" 70" 




"6" 




' -2 




"4" 


31 


= 7 





1 4 


1 


1 9 


3 


122 




7 




7 




5 



13. 



<*) A = 



(b) 



,4 = 



3 5~ 




~*l" 




7" 


1 1 


, x = 


*2 


, b = 


-1 


5 4 




*3 










1 

5 
3 



3 

9 

1 



f 




~*l" 




T 


8 

1 


, x = 


*2 
*3 


. b = 


3 



7 




*4 




2 



16. 



(a) 



(b) 



(c) 



-3 -15 -11 
21 -15 44 



-7 
2 
5 
3 



19 
IS 
25 
23 



3 


3~ 


-1 


4 


1 


5 


4 


-4 





14 



■43 
17 
35 
24 



17. 



(a) 



j4jj is a 2 x 3 matrix and 5 tl is a 2 x 2 matrix. ^4^5^ does not exist. 



21. 



(b) 



(a) 



-1 


23 - 


-10 






37 


-13 


8 






29 


23 


41 






ail 

















i322 














1333 

















«44 

















a 55 

















at 



(b) 



(c) 



(d) 



ail «i2 «13 ^14 a 1:5 «16 

^22 a 23 fl 24 a 25 fl 2i5 

(333 fl 34 fl 35 a 36 

(344 fl 45 fl 46 

a 55 a 56 





an 

(321 ^22 

(33i fl 32 fl 33 

fl 41 fl 42 fl 43 fl 44 

(3 5 i (3^2 a^3 (3 54 fljj 

flgl (3^2 ^63 fl 64 a 65 

flu ^12 

c32i fl 22 t323 

t332 £333 [334 

(343 fl 44 fl 45 

(3^4 (3^ fl 56 

(3^5 (3^6 



27. 



One; namely, ^4 = 



1 


1 0" 


1 


-1 









30. 



(a) 



(b) 



Yes; for example, 



Yes; for example, 



"0 


r 


_0 


0_ 


"1 


0" 





0_ 



32. True 

(a) 



(b) 



False; for example, A = 



1 -1 
1 -1 



True 



(c) 



True 



(d) 
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ii-^ 









1 3 " 




1 
2 






"1 


2 


-1" 


, B~ l = 


5 20 


, C~ l = 


-2 


. ^ _1 = 


2 


-5 


3 




1 1 










o i 












1 


3 










5 10 










3 



7. 



(a) 



(b) 



(c) 



(d) 



A = 



5 


1 " 


13 


13 


3 


2 


13 


13 



A = 



'2 


1 


7 




1 


3 


7 


7 



,4 = 



2 


1 


S 




1 


3 


5 


5 



,4 = 



_9_ 
13 
_2_ 
13 



13 
_6_ 
13 



9. 



(a) 



(b) 



^) = 



p(A) = 



{C) P(A) = 



"1 


1" 


_2 


-1_ 


"20 


7" 


14 


6_ 


"39 


13" 


26 


13_ 



11. $\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$



13. 



A~* = 



1 
fill 


■■ 








1 

«22 








■■ 


1 



18- C= -^"^ 



-Id/i-1 



19. 



(a) 



(b) 



1 


1 








2 


2 






1 


1 








2 


2 












1 


1 






2 


2 


-1 





1 


1 






2 


2 



1 

1 

1 








-1 
1 



20. 



(a) One example is 



(b) One example is 



"1 2 3" 




2 1 4 




3 4 5 




"0 -1 


-1" 


1 


-1 


1 1 






22. Yes 



23. 



A~ l = 



1 


1 


2 


2 


1 


1 


2 


2 


1 
2 


1 
2 



33. 



±1 





0" 





±1 











1 1 



34. If A is invertible, then jj~ is invertible. 

(a) 



True 



(b) 



Exercise Set 1.5 (page 57) 



1. (a), (c), (d), (f) 





"0 





f 


(a) 





1 







1 










"0 





f 


(b) 





1 







1 











1 


0" 


(c) 


1 




-2 1 




"1 0" 




(d) 


1 
2 1 





(a) 



(b) 



(c) 



(a) 



(b) 



(c) 



-7 


4 


2 ■ 


-1_ 


5 
39 


2 " 
13 


4 
39 


1 
13 


Not invertible 



1 3 


r 


1 


-1 


-2 2 






ji 

26 

26 


1 

_1 
3 






-3^2 
26 

A 

26 





1 
3 
1 
5 

- 






o" 








1 





i 




1 


1 


7 


7 



Not invertible 



(d) 



(e) 



3 
5 



4 
5 
3 
2 

i 1 
5 5 



4 -1 



I 
5 



1 
5 



10. 



<») E X = 



1 
5 1 



E 2 = 



(b) 



A~ l =B 2 Ei 



1 

<4 



(c) 



^4 — i^ i^ 



ii. 



(a) 



(b) 



1 


-4 


7" 


4 


5 


-3 


2 


-1 






(c) 



2 


-1 





4 

3 


5 
3 


-1 


1 


-4 


7 


"10 


9 


-6" 


4 


5 


-3 


1 


-4 


7 



14. 



"0 


1 


0] 


1 














lj 



1 0" 


"1 0] 


1 


1 


2 1 


1 lj 



1 


3 3 8" 





1 7 8 









19. Add -1 times the first row to the second row. 

(b) 

Add -1 times the first row to the third row. 

Add -1 times the second row to the first row. 
Add the second row to the third row. 

24. In general, no. Try t = ha=c=d = 0- 
1. $x_1 = 3$, $x_2 = -1$

4. $x_1 = 1$, $x_2 = -11$, $x_3 = 16$



6. w = — &,x = l,y=\0,z= —7 



(a) 



*1 = — ' *2 = - y» *3 = - -J- 



(b) 



*1 



-J' *2 =3' *3 



10 
3 



(c) 



*1 = 3, X2 = 0, X2 = — 4 



11. 



(a) *1=— '*2 = 



22 TO- ] 

W X2 ~rf 



*1 = TT'*2 



(b) ^- 17 



li 
17 



13. 



(a) 



Xi = W X2 = j5 



34 
*1 = 7T'*2 = 



(b) - 11 " 15 



28 
15 



19 



13 



(c) 



* 1 = T5'* 2= 15 



(d) 



*1 = -5' ^2 = J 



15. 



(a) 



x\= —\2 — 3tiX2= —5—£>X2=£ 



(b) 



*1 =7 — 3t> XT = 3 — t' X? = t 



19. ^j = £3 4- &4' &2 = 2^3 + &4 



21. 



jst = 



11 12-3 27 26 

_6 -8 1-13-17 

-15-21 9-38-35 



22. Only the trivial solution Xl — X2 = x ^ — x ^ — §\ invertible 

(a) 



(b) 



Infinitely many solutions; not invertible 



28. I — A is invertible. 

(a) 



(b) 



x=(l-A)~'h 



30. Yes, for nonsquare matrices 
(a) 



"l 





2 







1 




5 



Not invertible 



(b) 



(c) 



-1 





o" 





1 

2 











3 



3. 



(a) A l = 



1 
4 



(b) 



A 2 = 



1 





4 







1 




9 



A~' = 









±_ 
16 



1 



,4-* = 



,4- 2 = 



"4 





0" 





9 











16 



1 







l/(-2) J 



A~ k = 



2 k 













4' 



5. (a) 



7. a = 2, b = - 1 



10. 



(a) 



(b) 



1 







-1 










-1 




4 








J 


1 

2 











1 1 



11. 



16. 



(a) 



All 


312 


313" 


"3 





0" 


321 


322 


323 





5 





^31 


332 


333 








7 



No 



(b) 



Yes 



(b) 



17. Yes 



19. 





"4 0" 






4 


? 




4 




"-1 0" 




0-10 


? 





4 







4 










4 










-1 








4 











0" 







? 


-1 




0" 







? 


-1 







"4 0" 








0-10 


? 




4 




4 0" 




"-1 


0-1 


? 








-1 








-1 





0" 





4 











4 







0" 


-1 












-1 



20. Yes 

(a) 



(b) 



No (unless n = 1)



Yes 



(c) 



(d) 



No (unless n = 1)



24. 



(a) x 4 ^ J 



1 
2 



(b) 



x1 = -8, x2 = -4, x3 = 3



25. 



j4 = 



1 10 
-2 



26. §(!+«) 



Supplementary Exercises (page 76) 



w=!* + ^,/=-^+f, 



3. One possible answer is 



5. x = 4, y = 2,z=3 



a ≠ 0, b ≠ 2
(a) 



x\ + 5^2 4- 2^4 = 



(b) 



a ≠ 0, b = 2



(c) 



a = 0, b = 2



(d) 



a = 0, b ≠ 2



K = 



2 

1 1 



11. 



(a) 



(b) 



X = 



X = 



"-1 3 


-1" 


6 


1_ 


"1 -2" 




3 1_ 





(c) 



jst = 



113 160 



37 
20 
" 37 



37 
46 
" 37 



13. mpn multiplications and mp(n - 1) additions
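
The count in answer 13 can be checked by instrumenting the usual row-by-column product. The sketch below assumes the exercise concerns the product of an m x n matrix with an n x p matrix (the exercise text is not reproduced here); the function name matmul_with_counts is illustrative only.

```python
def matmul_with_counts(A, B):
    """Multiply A (m x n) by B (n x p) and count the scalar operations used."""
    m, n, p = len(A), len(B), len(B[0])
    mults = adds = 0
    C = [[0] * p for _ in range(m)]
    for i in range(m):
        for j in range(p):
            # Each entry is a sum of n products: n multiplications, n - 1 additions.
            total = A[i][0] * B[0][j]
            mults += 1
            for k in range(1, n):
                total += A[i][k] * B[k][j]
                mults += 1
                adds += 1
            C[i][j] = total
    return C, mults, adds


# A 2 x 3 matrix times a 3 x 4 matrix: m = 2, n = 3, p = 4.
A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1]]
C, mults, adds = matmul_with_counts(A, B)
print(mults, adds)  # mpn = 24 multiplications, mp(n - 1) = 16 additions
```

Each of the mp entries of the product costs n multiplications and n - 1 additions, which is where the two totals come from.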



15. a = 1, b = -2, c = 3



16. a = 1, b = -4, c = -5



26. j_ _1 c_i c _2 



29. 



(b) 



a n 
b n 
d c* 



where a = ( a — c 



Exercise Set 2.1 (page 94) 



(a) M11 = 29, M12 = 21, M13 = 27, M21 = -11, M22 = 13, M23 = -5, M31 = -19, M32 = -19, M33 = 19



(b) 



C11 = 29, C12 = -21, C13 = 27, C21 = 11, C22 = 13, C23 = 5, C31 = -19, C32 = 19, C33 = 19



3. 152 



4. 



(a) adj(A) =



29 


11 


-19" 


21 


13 


19 


27 


5 


19 



(b) 



A~ l = 



29 


11 


19 


152 


152 


152 


21 


13 


19 


152 


152 


152 


27 


5 


19 



152 152 



152 



6. -66 



8. k 2 -2k 2 -\§k I 95 



11. 



A~ l = 



3 -5 -5 

-3 4 5 

2 -2 -3 



13. 



A~ l = 



~1 


3 


1 


2 


2 





1 


3 
2 








1 






2 



15. 



i!"^ 



-4 3 

2 -1 

-7 

6 






-l" 








-1 


8 


1 


-7 



16. x1 = 1, x2 = 2



18. X __M4 y 
* 55 ' y 



_61 z _46 
55' 11 



21. Cramer's rule does not apply. 



22. 



ii"^ 



cos 3 — sin0 

sin0 cos0 

1 



24. T= i,j, = o, z =2,w = 



31. det(A) = 10 x ( - 108) = - 1080 



34. One 



Exercise Set 2.2 (page 101) 



-30 



(a) 



-2 



(b) 







(c) 







(d) 



4. 30 



6. -17 



8. 39 



11. -2 



12. 


(a) 
(b) 
(c) 
(d) 


-6 

72 
-6 
18 


16. 


(a) 


det(A) = - 1 




(b) 


det(A) = 1 


18. 


x = 0, 


-4 
det(2A) = -40 = 2^2 det(A)
(a) 



(b) 



det(-2A) = -448 = (-2)^3 det(A)
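
Both answers are instances of det(kA) = k^n det(A) for an n x n matrix A. The following is a minimal sketch that checks the identity numerically; the matrices A2 and A3 are illustrative stand-ins chosen to have determinants -10 and 56, not necessarily the matrices from the exercise.

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total


def scale(M, k):
    """Return the matrix kM."""
    return [[k * entry for entry in row] for row in M]


A2 = [[-1, 2], [3, 4]]                   # illustrative 2 x 2 matrix, det = -10
A3 = [[2, 0, 0], [0, 4, 0], [0, 0, 7]]   # illustrative 3 x 3 matrix, det = 56

print(det(scale(A2, 2)), 2 ** 2 * det(A2))        # both -40
print(det(scale(A3, -2)), (-2) ** 3 * det(A3))    # both -448
```

Scaling the whole matrix multiplies each of the n rows by k, and each row contributes one factor of k to the determinant.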



4. 


(a) 
(b) 
(c) 
(d) 


Invertible 
Not invertible 
Not invertible 
Not invertible 



6. If x = 0, the first and third rows are proportional.
If x = 2, the first and second rows are proportional. 

12. 5 _L i/l7 
(a) * = %— 



k= -1 



(b) 



14. 



(a) 



(b) 



(c) 



A-l -2" 


~*1~ 




"0" 


-2 A-l_ 


*2 




_0_ 


A-2 -3" 


~*1~ 




"0" 


-4 A-3_ 


*2 




_0_ 


A-3 -1" 


~*r 




"0" 


5 A + 3 


*2 








λ^2 - 2λ - 3 = 0
i. 



λ = -1, λ = 3



li. 



in. 



-t 
i 



λ^2 - 5λ - 6 = 0



i. 



λ = -1, λ = 6



li. 



in. 



r _. 


~3~ 


-i 




t 


i\ 








i 



λ^2 - 4 = 0



i. 



λ = -2, λ = 2



li. 



in. 



— t 
i 



20. No 



21. AB is singular. 



22. False

(a) 





True 


(b) 






False


(c) 






True 


(d) 




23. 


True 


(a) 






True 


(b) 






False


(c) 






True 


(d) 
(a)


5 


(b) 


9 


(c) 


6 


(d) 


10 


(e) 





(f) 


2 



3. 22 



5. 52 



7. a^2 - 5a + 21



9. -65 



11. -123 



λ = 1, λ = -3
(a) 



λ = -2, λ = 3, λ = 4
(b) 



16. 275 



17. 


= -120 
(a) 




= -120 
(b) 


18. 


3±/33 




A ~ 4 



22. Equals if n > l 

Supplementary Exercises (page 118) 



1/34/ 43 

1. x' = |i I ^y, / = - 1* I j7 



4. 2 



5 2 , 2 j 2 2 , j 2 2 

cos p = > cos 7 = — - 

Zac Zab 



12. det(5) = ( - i)K«-iy2 det(j4) 



13. The ith and jth columns will be interchanged.

(a) 

The ith column will be divided by c.
(b) 



(c) 



c times the jth column will be added to the ith column.



15. 



(a) 



λ^3 + (-a11 - a22 - a33)λ^2
+ (a11a22 + a11a33 + a22a33 - a12a21 - a13a31 - a23a32)λ
+ (a11a23a32 + a12a21a33 + a13a22a31 - a11a22a33 - a12a23a31 - a13a21a32)



18. 



(a) a= -5,A = 2,A = 4; 



' -2t~ 




~5*~ 




' if 


i 


? 


t 


? 


m 


i 




i 




t 



(b) 



A=i; 



t 



Exercise Set 3.1 (page 130)

(a) P1P2 = (-1, -1)

(b) P1P2 = (-7, -2)

(c) P1P2 = (2, 1)

(d) P1P2 = (a, b)

(e) P1P2 = (-5, 12, -6)

(f) P1P2 = (1, -1, -2)

(g) P1P2 = (-a, -b, -c)

(h) P1P2 = (a, b, c)



5. P(-1, 2, -4) is one possible answer.

(a) 



(b) 



P(7, -2, -6) is one possible answer.



6. (-2,1,-4) 

(a) 



(b) 



(-10, 6, 4) 



(c) 



(-7, 1, 10) 



(d) 



(80, -20, -80) 



(e) 



(132, -24, -72) 



(f) 



(-77, 8, 94) 



8. c1 = 2, c2 = -1, c3 = 2



10. c\ =C2 = C3 = 



12. 



(a) 



x' = 5,/ = : 



(b) 



x — — 1. y = 3 



15 - fj^I il _Ll _j® ^Jj^I JLu&l ^JiLtl Al±) 



U= ' 2 -2 r = "2-- 2 ' n + T = 



2 ' 2 J' U V= 1 2 ' 2 
1.



(a) 



(b) 



/13 



(c) 



(d) 



2^3 



3tf6 
(e) 




6 




(f) 




JS3 
(a) 




(b) 




4i/l7 
(c) 




J 466 




(d) 




f 3 6 

(e) [fii' fii- 


4 


1 





(f) 



9. f 2 4 

(b) \5'5 



(2 _3 6 
(c) V7' 7'7 



10. A sphere of radius 1 centered at (x0, y0, z0)



16. a = c = 

(a) 

At least one of a or c is not zero, that is, a^2 + c^2 > 0
(b) 



17. The distance from x to the origin is less than 1.

(a) 



||x - x0|| > 1

(b) 
1.


-11 
(a) 




-24 
(b) 





(c) 





(d) 


3. 


Orthogonal 
(a) 




Obtuse 
(b) 




Acute 
(c) 




Obtuse 
(d) 


5. 


(6,2) 
(a) 



11. 



(b) 



_2i _u\ 

13' 13 J 



[55 1 _ii 
(c) U3' ' 13 



73 __12 _32 1 l 
' 89' 89 j 



(d) \39 



8. (3k 2k) for any scalar k 

(b) 



) (s'sj'l 5' 5j 



(c 



i/10 3i/10 n 
cos B\ = -^-r- ' cos 2 = ' n ' cos 9 3 = 



13. ±(1/√3, 1/√3, -1/√3)



16. 


10 
(a) 3 




6 
(b) 5 




-60 + 34^3 




W 33 




1 
(d) 2 



...-if 2 



COS 



.fr 



COS ,tf = -ii^ip, cos 7 = 173 

(b) l|v|| IM 



27. The vector u is dotted with a scalar.

(a) 

A scalar is added to the vector w. 
(b) 

Scalars do not have norms.
(c) 

The scalar k is dotted with a vector. 
(d) 



29. No; it merely says that u is orthogonal to v - w.



30. r = (u · r) u/||u||^2 + (v · r) v/||v||^2 + (w · r) w/||w||^2



31. Theorem of Pythagoras 

Exercise Set 3.4 (page 153) 



1. (32, -6, -4) 

(a) 



(-14, -20, -82) 
(b) 



(c) 


(27, 40, -42)


(d) 


(0, 176, -264)


(e) 


(-44, 55, -22)


(f) 


(-8, -3, -8)


(a) 


{59 




(b) 


l/l01 




(c) 








7. For example, (1, 1, 1) × (2, -3, 5) = (8, -3, -5)
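
The cross product in answer 7 can be recomputed from the component formula, and its defining property (orthogonality to both factors) can be checked with dot products. This is a small sketch; the helper names cross and dot are illustrative.

```python
def cross(u, v):
    """Cross product of two vectors in 3-space, from the component formula."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)


def dot(u, v):
    """Euclidean dot product."""
    return sum(a * b for a, b in zip(u, v))


u = (1, 1, 1)
v = (2, -3, 5)
w = cross(u, v)
print(w)                     # (8, -3, -5)
print(dot(w, u), dot(w, v))  # 0 0, so w is orthogonal to both u and v
```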



9. 


-3 
(a) 




3 
(b) 




3 
(c) 




-3 
(d) 




-3 
(e) 





(D 


11. 


No 
(a) 




Yes 
(b) 




No 
(c) 



13. / s _J$ 4_\ (__6 3_ 4 

15. 2(v × u)



17. J26 

(a) 2 

(b) 3 



21. 


■/122 
(a) 

9k40°19" 
(b) 




23. 


m=(0, 1, 0)and n = 
(a) 

(-1,0,0) 
(b) 

(0 ,0, -1) 
(c) 


= (1,0,0) 


28. (- 


-8, 0, -8) 




31. 


2 

(a) 3 

1 

(b) 2 




35. 


u ■ w ^ 0? v ■ w = 
(b) 





36. No, the equation is equivalent to u × (v - w) = 0 and hence to v - w = ku for some scalar k.



38. They are collinear.



Exercise Set 3.5 (page 162) 



1. -2(x i 1) i (y-3)-(z+2) = 

(a) 



(*-l) 1 90-1) 
(b) 


H8(z-4) = 


2z = 




(c) 




* + 2y + 3z = 
(d) 





3. (0, 0, 5) is a point in the plane and n = (-3, 7, 2) is a normal vector so that -3(x - 0) + 7(y - 0) + 2(z - 5) = 0 is a point-normal form; other points and normals yield other correct answers.

(a)

(x - 0) + 0(y - 0) - 4(z - 0) = 0 is a possibility
(b) 



5. Not parallel 

(a) 

Parallel
(b) 



Parallel 
(c) 



9. x = 3 + It, y = - 1 + t, z = 2 + It 

(a) 



x = - 2 + 6t, y = 3 - 6t, z = -3-2t 
(b) 



(c) 



x = t, y = -2t, z = 3t
(d) 



x = -12 - 7t, y = -41 - 23t, z = t
(a) 



{h) X = lt,y = 0,Z = t 



13. Parallel 

(a) 



Not parallel 
(b) 



17. 2* + 3.y-5zH-36 = 



19. z-z = 

(a) 

(b) 



(c) 



21. 5x-2.y I z- 34 = 

23. y \ 2z-9 = 

27. * + 5yH 3z-13 = 

29. 4x+13.y-z-17 = 

31. 3^-7 -z- 2 = 



(a) 23 + 23 '^ 23 23 ,z ~ t 



(b) X =~^ ,y = ° ,Z = t 



39. 


5 
(a) 3 




1 




(b) /29 




4 
(c) /J 


43. 


x-3 



(a) 2 ^ + 3 



x 1 2 _ _ y-3 _ z + 3 
(b) 6 6 2 



44. x - 2y - 17 = 0 and x + 4z - 27 = 0 is one possible answer.

(a) 



x - 2y = 0 and -7y + 2z = 0 is one possible answer.
(b) 



45. 3^35° 

(a) 

0^79° 
(b) 



47. They are identical. 

Exercise Set 4.1 (page 178) 



1. 


(-1,9,-11,1) 
(a) 




(22, 53, -19, 14) 
(b) 




(-13, 13, -36, -2) 
(c) 




(-90,-114,60,-36) 
(d) 




(-9, -5, -5, -3) 
(e) 




(27, 29, -27, 9) 
(f) 



3. c1 = 1, c2 = 1, c3 = -1, c4 = 1



5. i/29 

(a) 



3 
(b) 

13 
(c) 





(d) 






8. 


*=*! 






10. 


(a) (/To' /To J 




/To 


14. 


Yes 

(a) 

No 
(b) 

Yes 

(c) 

No 
(d) 

No 
(e) 

Yes 






15. 


k= -3 
(a) 








k= —2>k = 
(b) 


-3 





19. x1 = 1, x2 = -1, x3 = 2

The component in the a direction is p r0 j u = — ( — \ 7 \ 7 2, 3); the orthogonal component is -r^- (34, 1 1, 52, — 27). 



22 



23. They do not intersect.



33. Euclidean measure of "box" in ,£" ■ a ^ & n 

(a) 



(b) 



Length of diagonal: J J. , 2 , T~~2 



35. d(u,v) = J2 

(a) 



37. True 

(a) 



True 



(b) 



False 



(c) 



True 



(d) 



(e) 



True, unless u = 



Exercise Set 4.2 (page 193) 



1. Linear; p^ » p 2 

(a) 



(b) 



Nonlinear; p 2 > p} 



(c) 



Linear; ^3 ^ ^3 



(d) 



Nonlinear; p 4 - t p 2 



3 


5 


-1" 


4 


-1 


1 


3 


2 


-1 



T(-1, 2, 4) = (3, -2, -3)



(a) 






f 


1 





1 


3 


1 


-1 



(b) 



7 2 


-1 f 


1 


1 


1 






(c) 

















































(d) 





1 


1 

1 






f 








1 











-1 






(a) 



(b) 



T(-l,4) = (5,4) 



T(2, 1, -3) = (0, -2, 0)



9. (2, -5, -3) 

(a) 



(b) 



(c) 



(2, 5, 3) 



(-2, -5, 3) 



13. 



-2, 



(a) i - 2 
(0, 1, 2/2) 



j/3-2 1 I 2^3 
2 



(b) 



(c) 



(-1,-2,2) 



(a) I A 2 ' 2 



(b) 



(c) 



(-2/2,1,0) 



(1, 2, 2) 



17. 



(a) 





1/2 -^3/2 



(b) 



(c) 



fi {l 


{2 {2 


1 0" 


-1_ 



19. 


(a) 


^3/3 
1/3 


-^3/16 
3/16 









1/8 




(b) 


"0 
-1 



0" 

-1 






(c) 


1 


-1 


0" 
-1 





21. 


(a) 


res 







1/16 

-/3/16 

^3/3 



No 



(b) 



24. 



^(1-cos0)+cos0 ^(l-cos^)--^sin^ \{\ - cos (3) - -j= sin£ 

3 3 ' ^3 3 V 3 

^(l-cos#)--j=sin# ^r(l-cos9) + cos9 ^0 - costf) - -j= sintf 

3 ' ^3 3 3 V 3 

^(1 -costf) --j=sin0 ^r(l-cos0)--^sin9 ^0 - cos3 ) + cos0 

3 ^3 3 ' V 3 



28. 90° 

(c) 



29. Twice the orthogonal projection on the x-axis 

(a) 



Twice the reflection about the x-axis 



(b) 



30. The x-coordinate is stretched by a factor of 2 and the y-coordinate is stretched by a factor of 3. 

(a) 



(b) 



Rotation through 30° 



31. Rotation through the angle 2θ



34. Only if £ = 0- 



Exercise Set 4.3 (page 206) 



1. Not one-to-one 

(a) 



One-to-one 



(b) 



One-to-one 



(c) 



One-to-one 



(d) 



(e) 



(f) 



(g) 



One-to-one 



One-to-one 



One-to-one 



3. For example, the vector (1, 3) is not in the range. 



(a) 



One-to-one; 



\T 1 (^1 ? ^2) 



12 11 



Not one-to-one 



(b) 



(c) 



One-to-one; 



-1 

1 



,-1 



; Z (wi,wr) = (-W7, -wi) 



Not one-to-one 



(d) 



7. Reflection about the x-axis 

(a) 



(b) 



Rotation through the angle -π/4



Contraction by a factor of ^ 
(c) 3 



(d) 



Reflection about the yz-plane 



(e) 



Dilation by a factor of 5 



9. 



Linear 



(a) 



Nonlinear 



(b) 



Linear 



(c) 



Nonlinear 



(d) 



12. 



/ a \ For a reflection about the y-axis, T(e\) = 



(b) p or a reflection about the ^z-plane, T(?{) = 



and T(e 2 ) = 



. Thus, T = 



-1 

1 



T 




0" 





. n*2> = 


-1 











, and 7(e 3 ) = 



. Thus, T — 






0" 


-1 








1 



s Q \ For an orthogonal projection on the x-axis, T(v i ) = 



and T(e 2 ) = 



. Thus, T — 



1 




(d) p or an orthogonal projection on the yz-plane, T(e\) = 



T(* 2 ) = 



and T(e 3 ) = 



. Thus, 



T = 



"0 





0" 





1 











1 



(e) ^ or a rota ti° n through a positive angle 0, T(e \ ) = 

cos 9 — sin0 
sm9 cos9 



cos 9 
sintf 



and T(e 2 ) = 



- sinfl 
cgs0 



. Thus, 



T = 



W For a dilation by a factor £ > 1 , T(e i ) = 



V 




"0" 





. n*2) = 


jt 











and 7*(e 3 ) = 



. Thus, T = 



~k 


0" 


t 








k 



13. 



(a) 7*(»l) = 



and 7*(e 2 ) = 



Thus, 7 = 



-1 




(b) r(ei) = 



(c) ?*Cei) 



0" 


and 7(e 2 ) = 


T 
_0_ 


. Thus, T = 


f 
_-l 0_ 


"0" 
_3_ 


and 7*(e 2 ) = 


"0" 
_0_ 


. Thus, T = 


"0 0" 
3 0_ 





16. Linear transformation from R^2 to R^3; one-to-one

(a) 



(b) 



Linear transformation from R* -, R 2 ; not one-to-one 



17. fl 1 

(a) U'2 



(b) (f.yj 

M-5^3 15-jg 
(c) 4' 4 



19. 





[0] 






V 


(a) A=l; 


s 


A = -1 







i 









~s~ 




"0" 




(b) A=l; 



t 


A = 


t 






(c) 



A = 2; all vectors in r} are eigenvectors 



(d) A=l; 



23. 



(a) 



cos 2θ    sin 2θ

sin 2θ   -cos 2θ



l + 5i/3 1/3-5 \ 
(b) [ 2 ' 2~ j 



27. The range of T is a proper subset of R^n.

(a) 



(b) 



T must map infinitely many vectors to 0. 



Exercise Set 4.4 (page 217) 



x 2 I 2x - 1 — 2 f3jr 2 \-2\= - 5x 2 h 2x - 5 
(a) ^ / 



(a) 



Yes;^ = 



~1 





o" 





1 











1 












(a) 



L: P1 → P1 where L maps ax + b to (a + b)x + a - b



3e^t + 3e^(-t) = 6 cosh(t)
(a) 



Yes 



(b) 



12. y = 2x : 



14. y= X i -X 

(a) 



15. y = 2x^3 - 2x + 2

(a) 



18. No, because of the arbitrary constant of integration 

(a) 



(b) 



No (except for p^) 



21. Each L_i(x) is a polynomial of degree at most n and hence so is the sum y0 L0(x) + ... + yn Ln(x); also, p(x_i) = 0 + ... + 0 + y_i L_i(x_i) + 0 + ... + 0 = y_i, showing that this function is an interpolant of degree at most n.

(a)
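
The argument in answer 21 is the Lagrange form of the interpolating polynomial: each basis polynomial equals 1 at its own node and 0 at every other node, so the weighted sum reproduces the data. The sketch below assumes distinct nodes and uses exact rational arithmetic; the function names lagrange_basis and interpolate are illustrative and do not come from the text.

```python
from fractions import Fraction


def lagrange_basis(xs, i, x):
    """Evaluate the i-th Lagrange basis polynomial for nodes xs at the point x."""
    result = Fraction(1)
    for j, xj in enumerate(xs):
        if j != i:
            # This factor is 1 at x = xs[i] and 0 at x = xj.
            result *= Fraction(x - xj, xs[i] - xj)
    return result


def interpolate(xs, ys, x):
    """Evaluate p(x), the sum over i of ys[i] * L_i(x)."""
    return sum(y * lagrange_basis(xs, i, x) for i, y in enumerate(ys))


# Sample data taken from q(x) = x^2 + 1 at the nodes 0, 1, 2.
xs = [0, 1, 2]
ys = [1, 2, 5]

print([interpolate(xs, ys, x) == y for x, y in zip(xs, ys)])  # every entry True
print(interpolate(xs, ys, 3))                                  # 10, which is q(3)
```

Because the interpolant of degree at most 2 through three points is unique, the value at x = 3 agrees with q(3) = 10.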



(b) 



It is I_(n+1) c = y, where c is the vector of c_i values and y is the vector of y-values.



Exercise Set 5.1 (page 226) 



1. Not a vector space. Axiom 8 fails. 



3. Not a vector space. Axioms 9 and 10 fail. 

5. The set is a vector space under the given operations. 

7. The set is a vector space under the given operations. 

Not a vector space. Axioms 1, 4, 5, and 6 fail. 

11. The set is a vector space under the given operations. 

13. The set is a vector space under the given operations. 

25. No. A vector space must have a zero element. 

26. No. Axioms 1, 4, and 6 will fail. 



29. 


Axiom 7 
1. 




Axiom 4 

2. 




Axiom 5 
3. 




Follows from statement 2 
4. 




Axiom 3 

5. 




Axiom 5 
6. 




Axiom 4 
7. 



32. No; d =0i + 2 = 2 

Exercise Set 5.2 (page 238) 

1. (a), (c) 



3. (a), (b), (d) 
5. (a), (b), (d) 



Line; x = -h,y = -h, z = t 
(a) *■ ^ 



(b) 


Line; x = 
Origin 


2U y — U 


z- 


:0 




(c) 


Origin 










(d) 












(e) 


Line; x = 


■■ -3t,y = 




■ It, z - 


= 1 


(f) 


Plane; * . 


-3y+z = 


= 







(a) -9 - 7x - 15x^2 = -2p1 + p2 - 2p3

(b) 6 + 11x + 6x^2 = 4p1 - 5p2 + p3

(c) 0 = 0p1 + 0p2 + 0p3

(d) 7 + 8x + 9x^2 = 0p1 - 2p2 + 3p3



11. The vectors span.

(a) 

The vectors do not span. 
(b) 

The vectors do not span.
(c) 

The vectors span. 
(d) 



12. (a), (c), (e) 



15. y=z 



24. They span a line if they are collinear and not both 0. They span a plane if they are not collinear. 

(a) 



(b) 



If u = av and v = bu for some real numbers a, b



(c) 



We must have b = 0 since a subspace must contain x = 0 and then b = A0 = 0.



26. 



(a) 



For example, 



"1 0" 




"0 1" 




"0 0" 




"0 0" 


_0 0_ 


? 


0_ 


5 


1 0_ 


1 


1_ 



(b) 



The set of matrices having one entry equal to 1 and all other entries equal to 0



Exercise Set 5.3 (page 248) 



1. u2 is a scalar multiple of u1.

(a) 



(b) 



The vectors are linearly dependent by Theorem 5.3.3. 



(c) 



p2 is a scalar multiple of p1.



(d) 



B is a scalar multiple of A. 



3. None 



5. They do not lie in a plane.

(a) 



(b) 



They do lie in a plane. 



(b) v1 = (2/7)v2 - (3/7)v3, v2 = (7/2)v1 + (3/2)v3, v3 = -(7/3)v1 + (2/3)v2



"•A=-I,A=, 



18. If and only if the vector is not zero 

19. They are linearly independent since v1, v2, and v3 do not lie in the same plane when they are placed with their initial points at the origin.

(a)

They are not linearly independent since v1, v2, and v3 lie in the same plane when they are placed with their initial points at the origin.

(b)



20. 


(a), (d), (e), (f) 


24. 


False 
(a) 




False 
(b) 




True 
(c) 




False 
(d) 


27. 


Yes 
(a) 



Exercise Set 5.4 (page 263) 



1. A basis for R^2 has two linearly independent vectors.

(a)

A basis for R^3 has three linearly independent vectors.
(b)

A basis for P2 has three linearly independent vectors.
(c)



A basis for M22 has four linearly independent vectors.
(d) 



3. (a), (b) 



(w)_S = (3, -7)
(a) 



(b) W*-(^' M 



9. (v)_S = (3, -2, 1)

(a) 



(v)_S = (-2, 0, 1)
(b) 



11. (A)_S = (-1, 1, -1, 3)

l3 - Basis: | - 1 - 1 1, o\ (0, -1, 0, 1); dimension = 2 

15. Basis: (3, 1, 0), (-1, 0, 1); dimension = 2 



19. 3-dimensional 

(a) 

2-dimensional 
(b) 

1-dimensional
(c) 



20. 3-dimensional 



{v1, v2, e1} or {v1, v2, e2}
(a) 



{v1, v2, e1} or {v1, v2, e2} or {v1, v2, e3}
(b) 



One possible answer is J - 1 I x - 2x , 3 4- 3x + 6x , 9 I. 

(a) L J 

One possible answer is \ 1 \ x,x , —2 \ 2x y 

(b) L J 

. . One possible answer is \ 1 I * — 3x k 

(c) L J 



29. (2, 0) 

(a) 



(b) ik ~ h, 

(0,1) 
(c) 



<d) W' b -fh 



31 

Yes; for example, 


"l 




0" 

1 1_ 


? 


o r 

_±1 0_ 


32. n 




(a) 




n(n+ l)/2 
(b) 




»(» + !)/: 


> 









(c) 



35. The dimension is n - 1.

(a) 



(b) 



(1, 0, 0, ... , 0, -1), (0, 1, 0, ... , 0, -1), (0, 0, 1, ... , 0, -1), ... , (0, 0, 0, ... , 1, -1) is a basis of size n - 1.



Exercise Set 5.5 (page 276) 



r1 = (2, -1, 0, 1), r2 = (3, 5, 7, -1), r3 = (1, 4, 2, 7); c1 = [2, 3, 1]^T, c2 = [-1, 5, 4]^T, c3 = [0, 7, 2]^T, c4 = [1, -1, 7]^T



3. 



(a) 



' -2 
10_ 


= 


"1" 
_4_ 


— 


3" 
_-6_ 



(b) 



b is not in the column space of A. 



(c) 



"1" 




"-1" 




"1" 




5" 


9 


-3 


3 


1 


1 


= 


1 


1 




1 




1 




-1 



(d) 



(e) 



2 




1 





= 


1 







-1 



I (<-l) 



= -26 



-1 




1 


1 


+ t 


-1 


-1 




1 



T 




~2 




"o" 




T 





+ 13 


1 


-7 


2 


1 4 


1 


1 




2 




1 




3 







1 




2 




[2J 



(a) 



"1" 


1 / 


~3~ 


; t. 


"3" 


|_0 J 




[lj 




L 1 J 



(b) 



(c) 



(d) 



"-2" 




~-1~ 




' -\ 


7 


+ £ 


-1 


\t 


-1 







1 




1 



~-l" 




~2 




"-1" 




"-2" 




'2 




"-1" 




"-2" 





+ r 


1 


+ .<? 





+ tf 





• r 


1 


1 ff 





+ * 





U 









1 




u 




u 




1 




u 

















1 














1 



~ 6~ 
5 




'l' 
5 




r 
5 




"7" 

5 




r 
5 


1 
5 


+ s 


4 
5 


+ t 


3 
5 


;s 


4 
5 


-\-t 


3 
5 







1 









1 

















1 









1 



< a > n = [l 2]»r 2 =[0 l]» c = 



(c) 



"1" 




"2" 





»=2 = 


1 











(b) 



ri = [ 1 -3 0].i'2=[0 1 0].ei = 



"1" 




"-3" 






'^2 = 


1 












ri = [1 2 4 5].i'2=[0 1 -3 0]>r3=[0 1 -3].i'4=[0 l],ci = 



"1" 




"2" 







1 





'C2 = 






















c 3 = 



4" 




5" 


-3 







1 


,c 4 = 


-3 







1 











(d) 



ri = [ 1 2 -1 5].i'2=[0 1 4 3].r3=[0 1 -7]>r 4 =[0 l].ei = 



"1" 




"2" 





' c 2 = 


1 



















c 3 = 



~-l" 




5~ 


4 
1 


,c 4 = 


3 
-7 







1 



9. 



(a) 


"1" 
5 


? 


-4 


(b) 


7 

"2" 
4 





-6 



(c) 



(d) 



f 




"4" 


2 


? 


1 


-1 




3 



r 




4~ 


3 




-2 


-1 


' 





2 




3 



(e) 



f 




"-3~ 




2~ 







3 




6 


2 


? 


-3 


? 


-2 


3 




-6 







-2 




9 




2 



u - (1, 1, -4, -3), (0, 1, -5, -2), (0, 0, 1, - 1 J 

(1,-1, 2,0), (0,1,0,0), |o,0, 1, -1J 



(b) 



(c) 



(1,1, 0,0), (0,1, 1,1), (0,0, 1,1), (0,0, 0,1) 



14. 



(b) 



"0 





0" 





1 











1 



17. 



3a — 5a 
3b -5b 



for all real numbers a, b not both 0. 



Exercise Set 5.6 (page 288) 



Rank(A) = rank(A^T) = 2



2:1 



(a) 



(b) 



1;2 



(c) 



2; 2 



(d) 



2; 3 



3:2 



(e) 



Rank = 4, nullity = 
(a) 



(b) 



Rank = 3, nullity = 2 



(c) 



Rank = 3, nullity = 



7. Yes, 



(a) 



No 



(b) 



(c) 



Yes, 2 



Yes, 7 



(d) 



(e) 



(f) 



No 



Yes, 4 



Yes,0 



(g) 



9. b1 = r, b2 = s, b3 = 4s - 3r, b4 = 2r - s, b5 = 8s - 7r



11. No 



13. Rank is 2 if r = 2 and s = 1; the rank is never 1.



16. 



(a) 



"1 





0" 





1 















(b) 



A line through the origin 



(c) 



A plane through the origin 



(d) 



The nullspace is a line through the origin and the row space is a plane through the origin. 



19. 



(a) 



(b) 



(c) 



(d) 



Supplementary Exercises (page 290) 



1. All of R 3 

(a) 



(b) 



Plane: 2x - 3y + z = 0



(c) 



Line: x = 2t, y = t, z = 0



(d) 



The origin: (0, 0, 0) 



a(4, 1, 1) + b(0, -1, 2)
(a) 



(b) 



(a + c)(3, -1, 2) + b(1, 4, 1)



5. 



a(2, 3, 0) + b(-1, 0, 4) + c(4, -1, 1)
(c) 



(a) v = ( - 1 I r)vi I f| - rjv 2 I rv 2 ; r arbitrary 



7. No 



9. 


(a) 


Rank = 2, nullity = 1




(b) 


Rank = 3, nullity = 2




(c) 


Rank = 


= m _l_ 1 , nullity = « 


11. 


{..* 


W, 


* 5 ,* 6 ,. ..,*"} 


13. 


(a) 


2 






(b) 


1 






(c) 


2 






(d) 


3 
(a)


2 


(b) 


11 


(c) 


-13 


(d) 


-8 







(e) 



3. 3 

(a) 



56 



(b) 



5. 29 

(b) 



(a) 


fe 
fi 


(b) 


"2 " 
{I 



9. No. Axiom 4 fails. 

(a) 



No. Axioms 2 and 3 fail. 



(b) 



Yes 



(c) 



No. Axiom 4 fails. 



(d) 



11. 



(a) 



3^2 



(b) 



3f5 



(c) 



3/l3 



13. 



(a) 



^74 



(b) 



15. 


^105 
(a) 




i/47 
(b) 


17. 


ia) f2,±f6,ijro 




(b) 3 V 



19. , , 1 

( u - v } = "9 U1V1 + U2V2 



'**• No for Py since p = x \x — — \{x — 1) satisfies (p, p} = 



27. _28. 

(a) 15 


(b) 



34. a = 1 1 25, b = lt 16 



Exercise Set 6.2 (page 315) 



1. 

(a) 


Yes 


(b) 


No 


(c) 


Yes 


(d) 


No 


(e) 


No 


(f) 


Yes 



(a) 


ft 




3 


(b) 


fn 







(c) 






20 


(d) 


9/T0 


(e) 


1 
~f2 




2 



(f) {55 



9. Orthogonal 

(a) 



(b) 



Orthogonal 



(c) 



Orthogonal 



(d) 



Not orthogonal 



±^r(-34,44, -6,11) 



x = t,y= -2t,z= -3t 
(a) 



(b) 



2x - 5y + Az = 



x-z = 



(c) 



17. 



(a) 



"1" 




"0" 




2" 


3 


5 


1 


9 


-1 


1 




1 




1 



18. 



(a) 



(16, 19, 1) 



(b) 



(c) 



(d) 



(0,1,0), (1,0,1 



(-1,-1,1,0), 



2 -i l) 
7' T ' j 



(-1,-1,1, 0, 0), (-2, -1, 0, 1, 0), (-1, -2, 0, 0, 1) 



32. , . 1 1 

(u, v) = ^"lvi 4- -gU 2 V2 



35. The line y — — x 

(a) 



37. 



(b) 



The x^-plane 



The x-axis 



(c) 



(a) 



False 



True 



(b) 



True 



(c) 



False 



(d) 
1. (a), (b), (d)



(b), (d) 



5. (a) 



1 2 



2 1 






(c) 



1 1 1 



l i W i i 



yr^'/j • w rwf*' &, 



9. 7 1 , ~ 

(a) -5Vi+ y v 2 + 2v 3 



37 9 

(b) -f Vl -5 V2l4v3 



3 1 5 

(C) " 7 Vl - 7 V2 ' 7 v 3 



U ' (a) W ff =(-2^2.5/2) 



(b) 



(w)^=(0, -2,1) 



13 - w -('-*-iM*H?-fM 



_ 4 _Ii 23 
5 ' 5 



(b) 



|v|| = ,/T3, rf (ii, v) = 5^3, (iv, v} = 13 



15. 4 11 . n 1 

(b) n= -r i -To va ' °^ + r 4 



(a) (^^^M"^^°}(^^"^ 



(b ) (1,0,0), o,-^ 



2 W Q 2 7 



/53' /53 J' ^ /53 ' /53 . 



19. 



j 2_^ f_jd __2 l 



ft' ft}' \ ft' /30 ' /30 



21. 



wi = 



-f 2 -!)H^f) 



24. 



(a) 



1 


2 


\fifi\ 


2 


1 


fi 



(b) 



1 


1 




& 


f 




o 


1 


{l 3/2 




p 


{3 


1 


1 


L J 


[fi 


^J 





(c) 



1 8 

3 /234 

2 11 

3 /234 

2 7 

3 /234 



^26 



(d) 



1 


1 


1 1 


^ 


fi 


^ 


n 


1 


2 


u 


^ 


^ 


1 


1 


1 


^ 


^ 


^J 



/2 /2 


^1 


{3 


1 





4 
/6_ 



(e) 



_i_ .li- 



fe 2/l9 /l9 



{2 2/l9 /l9 

*H -L- 



fi -7= 1/2 

v /2 y 

/2 /l9 







1 



(f) 



19 |/19 

Columns not linearly independent 



29. 



vi 



^-^H-^- 1 ) 



;1 - vi = 1. v 2 = ^3(2* - 1)' v 3 = ^( 6 * 2 - &r + l) 
35. (l / /5, 1 / ^5~l (2 / /30, - 3 / /30) 
1.



(a) 



21 25 
25 35 



*2 



20 
20 



(b) 



5 


-1 5" 


~*l" 




~-f 


1 


22 30 


*2 


= 


9 


5 


30 45 


*3 




13 



(a) 



x\ = 5, *:> = -; 



11 

2 

_ 9 

2 

-4 



(b) 



X[=-,X 2 





46 ~ 
21 


2. 

T 


5 
21 




13 
21 



(c) 



X] = 12> x?= —3>x^ = 9; 



(d) 



*1 = 14, X2 = 30, X3 = 26; 



3 
3 
9 


2 

6 

-2 

4 



5. (7, 2, 9, 5) 

(a) 



12. _ 4 12. 16 
(b) I 5 ' 5' 5 ' 5 



7. 



(a) 



(b) 



"1 


0" 


_0 


0_ 


"0 


0" 





1_ 



11. 



(a) 



vi = (2, -1,4) 



(b) 



4 


2 


8 


21 


21 


21 


2 
21 


1 

21 


4 
21 


8 


4 


16 


21 


21 


21 



(c) 



2f X0 ~ 2f yo + 2\ Z0 
2 1 4 

" 2\ X0 + 21 y ° " 2l Z0 

£*0 - ^0 I ^f*0 



^497 
(d) 7— 



17. 



[^]=,4 r (A4 r ) ,4 



18. Since A T = 

1. 



2. 



Since A^T A is invertible



3. 



Since the nullspace of A is nonzero if and only if the columns of A are dependent 



Exercise Set 6.5 (page 345) 



1. 


(a) [ w ]*= 


3 
_-7 










(b) 

[w]-r= 


" 5 " 
28 

3 
14 








(C) [ W ]j = 


b — a 
2 






3. 




4 




(a) (p)*=(4, -3,1), [p]j = 


-3 
1 






(b) (p)*=(0,2, -1), [p]j = 


2 
-1 


5. 


w=(16, 
(a) 


10,1 


2) 







(b) 



q = 3 + 4x" 



(c) 



5 = 



15 
6 



1 
3 



(a) 



(b) 



(c) 



11 
10 

_2 
5 



1 
2 





-4 



-2 -^ 



5 
2 

12 
2 



[«"]= = 



XL 
10 



Wb' = 



-4 
-7 



(a) 



(b) 



2 4 



-2 

5 

_2 
2 

23 
2 
6 



-3 

1 



11. 



(b) 



2 
1 3 



1 





2 




1 


1 


6 


3 



(c) 



(d) [h] s = 



[h]£< = 



1 

2 
(b)



4 
5 



3 
5 



9 


12" 


25 


25 


4 


3 


5 


5 


12 


16 


25 


25 



3. 



(a) 



1 
1 



7. 



(b) 



(d) 



(e) 



1 

f2 


1 

f2 


1 

f2 


1 

f2 



1 

ft 


"f 
ft 


1 

fe 


2 1 


1 

/3 


1 1 

f2 fl 



1 


1 


1 


2 


2 


2 


1 


5 


1 


2 


6 


6 


1 


1 


1 


2 


6 


6 


1 


1 


5 


2 


6 


6 



} (-1 I 3/3,3 l73) 



(a 



<„) (f-^f^ 1 



9 - (a, (-i-f^- 2 -!-^) 



1 3 



(b) \2"2^' 6 '-2-2^ 



11. 



(a) A = 



(b) ,4 = 



cosfl 


— sin# 


1 





sin0 


cos 9 


1 





cost? 


sin0 


— sin# 


cos0 



12. 



J2 i[e 

4 4 

2 2 

/2 ^6 



_J2 

2 


2 



16. Rotation 

(a) 



Rotation followed by a reflection 
(b) 



20. Rotation and reflection 

(a) 

Rotation and dilation 
(b) 

Any rigid operator is angle preserving. Any dilation or contraction with k ≠ 0, 1 is angle preserving but not rigid.
(c)



22. a = 0, b = {2J3, c = - f\H> or a = Q, b = - {liz, c = f\H> 
(0, a, a, 0) with a ≠ 0
(a) 



(b) 






0, -t=. -k 



* ± -ko. i 



.1/2"" ifo 






11. θ approaches π/2



12. The diagonals of a parallelogram are perpendicular if and only if its sides have the same length. 

(b) 
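
The statement in answer 12 follows from a one-line dot-product computation. As a sketch, assume the parallelogram is spanned by vectors u and v (notation introduced here for illustration), so that its diagonals are u + v and u - v:

```latex
(\mathbf{u}+\mathbf{v})\cdot(\mathbf{u}-\mathbf{v})
  = \mathbf{u}\cdot\mathbf{u} - \mathbf{u}\cdot\mathbf{v}
    + \mathbf{v}\cdot\mathbf{u} - \mathbf{v}\cdot\mathbf{v}
  = \|\mathbf{u}\|^{2} - \|\mathbf{v}\|^{2}
```

The diagonals are perpendicular exactly when this dot product is zero, that is, when ||u|| = ||v||.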



Exercise Set 7.1 (page 367) 



1. (a) λ^2 - 2λ - 3 = 0

(b) λ^2 - 8λ + 16 = 0

(c) λ^2 - 12 = 0

(d) λ^2 + 3 = 0

(e) λ^2 = 0

(f) λ^2 - 2λ + 1 = 0

3.



(a) Basis for eigenspace corresponding to A = 3'- 

(b) Basis for eigenspace corresponding to A = 4- 



basis for eigenspace corresponding to A = — 1 : 



^ Basis for eigenspace corresponding to A = Jl2'- 
3 
1 



if\2 
1 



; basis for eigenspace corresponding to A = — J\2'- 



(d) 



There are no eigenspaces. 



, -> Basis for eigenspace corresponding to A = 0: 



(f) 



(a) 



Basis for eigenspace corresponding to A = l: 



"1" 




"0" 


_0_ 


? 


_1_ 


"1" 




"0" 


_0_ 


? 


_1_ 



A = 1 : basis 



(b) 



A = 0: basis 



; A = 2- basis 



1 
2 
1 
1 



; A = 3'- basis 



A = ^2: basis 



1(15 + 5/2) 
i(-1 + 2/2) 



; A = - ^2: basis 



1(15-5/2) 
I(-l-2/2) 



(c) 



A = — 8: basis 



(d) 



A = 2'- basis 



(e) 



A = 2'- basis 



-2 



(f) 



A = — 4: basis 



; A = 3'- basis 



5 

-2 

1 



λ = 1, λ = -2, λ = -1
(a) 



λ = 4



(b) 



λ = -1, λ = 5
(a) 



(b) 



λ = 3, λ = 7, λ = 1



(c) 



A — — ^?A — 1? A — 



13. y = x and y = 2x

(a) 



No lines 



(b) 



(c) 



y = 



14. -5 

(a) 



7 



(b) 



22. 



(a) 



Ai = i: 



"1" 




"l" 




"1" 




1 


;*.-!= 


2 
1 


;*-£ 


1 
1 














(b) 



Ai= -2- 



1 

1 



;a 2 = -l: 



I 

2 
1 




; A 3 = 0: 



(c) 



Ai =3: 



"1" 




"l" 
2 




"1" 





;A 2 = 4: 


1 


; a 3 = 5: 


1 


1 








1 














25. A is 6 x 6 

(a) 



A is invertible. 



(b) 



(c) 



A has three eigenspaces. 
1. λ = 0: 1 or 2; λ = 1: 1; λ = 2: 1, 2, or 3



3. Not diagonalizable 



5. Not diagonalizable 



7. Not diagonalizable 



P = 



"1 





3 




1 


1_ 



■P~ l AP = 



11. 



P = 



2 


o r 





1 


1 






■P~ l AP: 



"3 





0" 





3 











2 



13. 



P = 



"1 


2 


r 


1 


3 


3 


1 


3 


4 



•P" 1 ^: 



"1 


0" 


2 








3 



16. Not diagonalizable 



17. 



P = 



1 


1 





o" 





1 


1 











1 


1 











1 



■P~ l AP: 



2 












o" 


-2 











3 











3 



19. 



-1 10237 -2047 
1 

10245 -2048 



21. 



1 


1 


1] 


2 





-1 


1 


-1 


lj 



A n = PD n P~ l = 



and ^2 are as in Exercise 18 of Section 7.1. 



r o 


ol 


3" 








A n \ 



1 


1 


1 


6 


3 


6 


1 
2 





1 
2 


1 


1 


1 


3 


3 


3 



One possibility is p = 



-b -b 
a — X\ a — \2 



where Aj 



25. 



False 



(a) 



False 



(b) 



True 



(c) 



True 



(d) 



True 



(e) 



27. Eigenvalues λ must satisfy -1 < λ ≤ 1.

(a) 

If A = PDP^(-1) with D diagonal, then lim (k → +∞) A^k = PD'P^(-1), where D' is obtained from D by setting all diagonal entries that are not 1 to 0.
(b)
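
The limit described in answer 27 can be illustrated numerically. The sketch below builds a hypothetical 2 x 2 matrix A = PDP^(-1) with eigenvalues 1 and 1/2 (P and D are chosen here for illustration and do not come from the exercise), then compares a high power of A with PD'P^(-1) using exact fractions.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def matpow(X, k):
    """X raised to the nonnegative integer power k by repeated multiplication."""
    n = len(X)
    result = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, X)
    return result


# Hypothetical example: P has determinant 1, and D has eigenvalues 1 and 1/2.
P = [[Fraction(1), Fraction(1)], [Fraction(1), Fraction(2)]]
P_inv = [[Fraction(2), Fraction(-1)], [Fraction(-1), Fraction(1)]]
D = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1, 2)]]
# D' keeps the diagonal entries equal to 1 and replaces the others by 0.
D_prime = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(0)]]

A = matmul(matmul(P, D), P_inv)
limit = matmul(matmul(P, D_prime), P_inv)
A50 = matpow(A, 50)

gap = max(abs(A50[i][j] - limit[i][j]) for i in range(2) for j in range(2))
print(limit)        # the entries are 2, -1 in each row
print(float(gap))   # tiny: the eigenvalue 1/2 has been raised to the 50th power
```

Since A^k = PD^kP^(-1), only the diagonal entries of D decide the limit: an entry equal to 1 stays 1, and an entry of absolute value less than 1 tends to 0.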



Exercise Set 7.3 (page 383) 



1. (a) λ^2 - 5λ = 0; λ = 0: one-dimensional; λ = 5: one-dimensional

(b) λ^3 - 27λ - 54 = 0; λ = 6: one-dimensional; λ = -3: two-dimensional

(c) λ^3 - 3λ^2 = 0; λ = 3: one-dimensional; λ = 0: two-dimensional

(d) λ^3 - 12λ^2 + 36λ - 32 = 0; λ = 2: two-dimensional; λ = 8: one-dimensional

(e) λ^4 - 8λ^3 = 0; λ = 0: three-dimensional; λ = 8: one-dimensional

(f) λ^4 - 8λ^3 + 22λ^2 - 24λ + 9 = 0; λ = 1: two-dimensional; λ = 3: two-dimensional



3. 



P = 



2 


H] 


F 


f> 


Jl 


2 


f> 


•P\ 



■P~ l AP = 



3 
10 



P = 



1 

4 o % 



■P~ l AP: 



25 











-3 











-50 



p = 



9. 



P = 



1 


1 

fe 


1 


2 


1 


1 

fe 


4 
5 


3 
5 


3 
5 


4 
5 



1 



"0 0" 
3 


1 


3 



0-^4 



oo 4 ^ 















4 
5 


3 
5 


3 
5 


4 
5 





"-25 








o" 


P~ l AP = 






25 




-25 

















25 



12. 



(b) 



-f ° 


1 
f2 


1 





-+ ° 


1 



15. 





"3 





0" 


Yes; take A — 





3 


4 







4 


3 
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1. The transformation rotates vectors through the angle 0; therefore, if < 9 < x, then no nonzero vector is 

(b) transformed into a vector in the same or opposite direction. 



(c) 



"1 


1 0" 





2 1 





3 



9 -A 2 = 



"15 30" 


A 2 = 


"75 150" 


A 4 = 


. 5 10 _ 


•> 


25 50_ 


5 



375 750 
125 250 



A> = 



1375 3750 
625 1250 



12. 



(b) 





1 
1 
1 



-1 

2 
-1 
-3 



13. The are all 0, 1, or -1. 



15. 



1 
1 









1 

2 


1 
2 


1 
2 


1 
2 



Exercise Set 8.1 (page 398) 



3. Nonlinear 



5. Linear 



9. Linear 

(a) 



Nonlinear 



(b)



13. T(x_1, x_2) = (1/7)(3x_1 - x_2, -9x_1 - 4x_2, 5x_1 + 10x_2); T(2, -3) = (9/7, -6/7, -20/7)



15. T(x_1, x_2, x_3) = (-41x_1 + 9x_2 + 24x_3, 14x_1 - 3x_2 - 8x_3); T(7, 13, 7) = (-2, 3)
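A short check of the two formulas in answers 13 and 15: evaluating each transformation at the stated point should reproduce the stated image. This is a sketch that assumes the common factor in answer 13 is 1/7 and the last coefficient in answer 15 is -8, the values that make the listed images come out exactly; `t13` and `t15` are illustrative names.

```python
from fractions import Fraction


def t13(x1, x2):
    """T(x1, x2) = (1/7)(3x1 - x2, -9x1 - 4x2, 5x1 + 10x2)."""
    s = Fraction(1, 7)
    return (s * (3 * x1 - x2), s * (-9 * x1 - 4 * x2), s * (5 * x1 + 10 * x2))


def t15(x1, x2, x3):
    """T(x1, x2, x3) = (-41x1 + 9x2 + 24x3, 14x1 - 3x2 - 8x3)."""
    return (-41 * x1 + 9 * x2 + 24 * x3, 14 * x1 - 3 * x2 - 8 * x3)


print(t13(2, -3))     # exact fractions
print(t15(7, 13, 7))
```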



(a) Domain: R^2; codomain: R^2; (T_2 ∘ T_1)(x, y) = (2x - 3y, 2x + 3y)

(b) Domain: R^2; codomain: R^2; (T_2 ∘ T_1)(x, y) = (4x - 12y, 3x - 9y)

(c) Domain: R^2; codomain: R^2; (T_2 ∘ T_1)(x, y) = (2x + 3y, x - 2y)

(d) Domain: R^2; codomain: R^2; (T_2 ∘ T_1)(x, y) = (0, 2x)



19. (a) a + d

(b) (T_2 ∘ T_1)(A) does not exist since T_1(A) is not a 2 × 2 matrix.



22. (T_2 ∘ T_1)(a_0 + a_1 x + a_2 x^2) = (a_0 + a_1 + a_2)x + (a_1 + 2a_2)x^2 + a_2 x^3



(b) (3T)(x_1, x_2) = (6x_1 - 3x_2, 3x_2 + 3x_1)



28. 


No 
(b) 


31. 


x 2 1 3x 
(a) 




sin* 




(b) 




s x -\ 
(c) 



Exercise Set 8.2 (page 405) 

1. (a), (c) 



(a),(b),(c) 



5. (b) 



7 - W (H 



(b) (2 



^, -4, l,o) 



No basis exists. 



(c) 



11. 



(a) 



(b) 



(c) 



Rank(T) = 1, nullity(T) = 2



(d) 



Rank(A) = 1, nullity(A) = 2



13. 



(a) 



f 

3 

-1 

2 


' 


0" 

1 

2 

7 

5 

14 


? 


"0" 



1 



(b) _ 



~-l" 




~-l~ 


-1 




-2 


1 


J 


















1 



(c) 



Rank(T) = 3, nullity(T) = 2



(d) 



Rank(A) = 3, nullity(A) = 2



15. ker(T) = {0}; R(T) = V



17. Nullity(T) = 0, rank(T) = 6



(a) x = -t, y = -t, z = t, -∞ < t < +∞

(b) 14x - 8y - 5z = 0



25. ker(D) consists of all constant polynomials.



27. ker(D ∘ D) consists of all functions of the form ax + b; ker(D ∘ D ∘ D) consists of all functions of the form ax^2 + bx + c.



30. (a) D ∘ D ∘ D ∘ D, where D is differentiation

(b) D ∘ D ∘ ⋯ ∘ D (n + 1 times)



Exercise Set 8.3 (page 413) 

1. (a) ker(T) = {0}; T is one-to-one.

(b) ker(T) = {k(-3/2, 1)}; T is not one-to-one.

(c) ker(T) = {0}; T is one-to-one.

(d) ker(T) = {0}; T is one-to-one.

(e) ker(T) = {k(1, 1)}; T is not one-to-one.

(f) ker(T) = {k(0, 1, -1)}; T is not one-to-one.



3. T has no inverse. 

(a) 



(b) 



*1 
*2 
*3 



1 1 3 

8 X1 'S 12 "? 3 

8 T1 ' h ' 4* 3 

3 5 1 

3*1 + 3*2 + ^3 



(c) 



,-1 



(d) T -l 







111 


- 


*1~ 

*2 
*3_ 


= 


2*1-2*2+2*3 

~h ' 2* 2 ' h 


*1~ 




3x\ 1 3*2 -*3 




*2 


= 


— 2*i — 2*2 1 *3 




*3 




— 4*i — 5*2 1 2*3 





(a) ker(T) = {k(-1, 1)}



(b) T is not one-to-one since ker(T) ≠ {0}.



7. T is one-to-one. 

(a) 



(b) 



(c) 



T is not one-to-one. 



Tis not one-to-one. 



T is one-to-one. 



(d) 



11. (a) a_i ≠ 0 for i = 1, 2, 3, ..., n



(b) T^{-1}(x_1, x_2, x_3, ..., x_n) = ((1/a_1)x_1, (1/a_2)x_2, (1/a_3)x_3, ..., (1/a_n)x_n)
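Answer 11 describes a componentwise scaling and its inverse. The sketch below assumes T(x_1, ..., x_n) = (a_1 x_1, ..., a_n x_n), which is what the stated inverse implies; `scale` and `unscale` are illustrative names.

```python
from fractions import Fraction


def scale(a, x):
    """T(x1, ..., xn) = (a1*x1, ..., an*xn)."""
    return [ai * xi for ai, xi in zip(a, x)]


def unscale(a, x):
    """Inverse of scale; defined only when every a_i is nonzero."""
    if any(ai == 0 for ai in a):
        raise ValueError("T is not one-to-one when some a_i is 0")
    return [Fraction(xi) / ai for ai, xi in zip(a, x)]


a = [2, -3, 5]
x = [1, 4, -2]
print(unscale(a, scale(a, x)))  # recovers x as exact fractions
```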



13. (a) T_1^{-1}(p(x)) = p(x)/x; T_2^{-1}(p(x)) = p(x - 1); (T_2 ∘ T_1)^{-1}(p(x)) = (1/x)p(x - 1)



15. (1,-1) 

(a) 



,-i 



(d) 



T^{-1}(2, 3) = 2 + x



17. 



T is not one-to-one. 



(a) 



(b) 



(c) 



T is one-to-one. T 



T is one-to-one. T 



-1 


a b 




~a c~ 






c d 




_b d_ 




-1 


a b 




d 


-b~ 




c d 




— c 


a 



21. T is not one-to-one since, for example, f(x) = x(x - 1) is in its kernel.



25. Yes; it is one-to-one. 
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l. 



(a) 



(a) 











1 










1 







1 




1 


-1 


1 





1 


-2 








1 



(a) 






o" 


1 


1 


2 




3 


4 


3 


3 



(a) 



(b) 



"1 


1 


f 





2 


4 








4 



3 + 10x+ 16*' 



(a) [?X v l)]£ : 



(b) r(*i) = 



[T(v 2 )] B = 



3 
5 



> T<?2) = 



-2 
29 



(c) 



(d) 



*1 
* 2 

12. 
7 

83 
7 



18 


r 




7 


7 


*1 


107 


24 


*2 


7 


7 





11. 



(a) 



(b) 



"1" 




3" 




"-1 


2 


. [7*(y 2 )]b = 





. [^CV3)] 5 = 


5 


6 




-2 




4 



[7Xvi)] £ = 



T(vi) = 16 -I- 51* 4 19* 2 ' T(v 2 ) = - 6 - 5* I 5* 2 , T(v 3 ) =7 + 40* 4- 15* 2 



(c) 



(d) 



™/ , , 2\ 239ao-161ai I 289a 2 
T[ao + aix-\ a 2 x j = u ^4 



201an-lllai I 247fl? 

t" w-w ^ 



Wl 4 * 2 ) = 22 4- 56x 4 14x 2 



61flQ — 31fli I 107i32 2 
12 x 



13. 



(a) 



IT2 ° 7^1 ] £',B = 



"0 


0" 


6 














-9 



[T 2 ] 



B',B' 



"0 





0" 


3 











3 











3 



, [T'l ]*",£ = 



"2 


0" 











-3 



(b) 



[7*2 o Ti ] fl * jfl = [T2] 5 ' ;B ''[7i] B » ;B 



19. 





"0 





0" 


(a) 





-1 







1 




"0 


0" 




(b) 





1 









2 






"2 


1 0" 




(c) 





2 2 









2 





(d) l4e 



2x -8*e 2 * 



20* V* since 



"2 1 0" 


4" 




14" 


2 2 


6 


= 


-8 


2 


-10 




-20 



21. 



/ D» 



(a) 



B',B 



I nttt 



(b) 



B',B 



22. We can easily compute kernels, ranges, and compositions of linear transformations. 
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[T] B = 



1 -2 
-1 



AT] 3 ' = 



3_ 
11 
_2_ 
11 



56 
11 

3_ 
11 



[T] B = 



1 


1 


,[T] B ' = 


/f 


7f_ 





13 



25 



ll/2 ll/2 

5 9 



ll/2 ll/2 



[r] fl = 



"1 0" 




"1 0" 


1 


,[T]b> = 


1 1 











8. (a) det(T) = 17



(b) det(T) =



(c) det(T) = 1



10. 



(a) 



[T] B = 



T is one-to-one. 



1 


1 


1 


1 


f 





2 


4 


6 


8 








4 


12 


24 











8 


32 














16 



, where B is the standard basis for P_4; rank(T) = 5 and nullity(T) = 0.



(b) 



12. 



(») «; = 



( fe ) «; = 



"-f 




"1" 




1 


' u 2 = 





t 
, u 3 = 







1 




"-1" 




"1" 




1 


,u 2 = 





. U3 = 







1 





( c > «; = 



"1" 




"0" 




"-1" 


2 


,v! 2 = 


1 


, U3 = 





1 









1 



14. (a) λ = 1, λ = -2, λ = -1



(b) Basis for eigenspace corresponding to λ = 1:




1 



and 



2 3 
1 



basis for eigenspace corresponding to λ = -2:



-1 
1 



basis for eigenspace corresponding to λ = -1:



2 1 
1 



21. £ — p~ l jip is similar to A. 

1. 



The distributive law for matrices 



4. 



The determinant of a product is the product of the determinants. 



5. 



The commutative law for real multiplication 



det(P^{-1}) = 1/det(P)



23. The choice of an appropriate basis can yield a better understanding of the linear operator. 
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2. When A is noninvertible. 



5. No (not onto) 

(a) 



Yes 



(b) 



(c) 



No (not one-to-one) 



(d) 



No (not one-to-one) 



11. 



The matrix is 





















-1 








1 





























2 






-2 




Supplementary Exercises (page 446) 

1. No. T(x_1 + x_2) = A(x_1 + x_2) + B ≠ (Ax_1 + B) + (Ax_2 + B) = T(x_1) + T(x_2), and if c ≠ 1, then
T(cx) = cAx + B ≠ c(Ax + B) = cT(x).
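The failure of linearity in answer 1 can be seen numerically. This is a sketch with a hypothetical 2 × 2 matrix A and nonzero vector B chosen only for illustration; with B = 0 both tests pass, which is the linear case.

```python
def affine(A, B, x):
    """T(x) = Ax + B for a matrix A (list of rows) and vectors B, x."""
    return [sum(a * xi for a, xi in zip(row, x)) + b for row, b in zip(A, B)]


def add(u, v):
    return [ui + vi for ui, vi in zip(u, v)]


def scale(c, v):
    return [c * vi for vi in v]


A = [[1, 2], [3, 4]]   # hypothetical example matrix
B = [1, -1]            # any nonzero B breaks linearity
x1, x2 = [1, 0], [0, 1]

additive = affine(A, B, add(x1, x2)) == add(affine(A, B, x1), affine(A, B, x2))
homogeneous = affine(A, B, scale(2, x1)) == scale(2, affine(A, B, x1))
print(additive, homogeneous)
```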



5. (a) T(e_3) and any two of T(e_1), T(e_2), and T(e_4) form bases for the range; (-1, 1, 0, 1) is a basis for the kernel.



(b) 



Rank = 3, nullity = 1 



(a) Rank(T) = 2 and nullity(T) = 2



(b) T is not one-to-one.



11. Rank = 3, nullity = 1 



13. 



1 








0" 








1 








1 

















1 



(a) v_1 = 2u_1 + u_2, v_2 = -u_1 + u_2 + u_3, v_3 = 3u_1 + 4u_2 + 2u_3



(b) u_1 = -2v_1 - 2v_2 + v_3, u_2 = 5v_1 + 4v_2 - 2v_3, u_3 = -7v_1 - 5v_2 + 3v_3
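The two sets of relations in parts (a) and (b) should be inverse to each other. A quick sketch: store the coefficients as columns of two matrices and confirm that their product is the identity (the names `V_IN_U` and `U_IN_V` are illustrative).

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Column j holds the coordinates of v_j relative to (u_1, u_2, u_3), from part (a)
V_IN_U = [[2, -1, 3],
          [1, 1, 4],
          [0, 1, 2]]

# Column j holds the coordinates of u_j relative to (v_1, v_2, v_3), from part (b)
U_IN_V = [[-2, 5, -7],
          [-2, 4, -5],
          [1, -2, 3]]

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(mat_mul(V_IN_U, U_IN_V) == IDENTITY)
```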



17. 



[T] B = 



1 


-1 


f 





1 





1 





-1 



20. 



(a) 



(d) 



2 

6 

12 

- Zx A + 3 



(e) 




21. The points are on the graph. 



24. 






1 





■ 


■ 








1 


■ 


- 











1 ■ 


- 











■ 


- 1 











■ 


- 



Exercise Set 9.1 (page 456) 



(a) 



y_1 = c_1 e^{5x} - 2c_2 e^{-x}
y_2 = c_1 e^{5x} + c_2 e^{-x}



(b) 



71 = 

72 = 



(a) y_1 = -c_2 e^{2x} + c_3 e^{3x}

y_2 = c_1 e^x + 2c_2 e^{2x} - c_3 e^{3x}

y_3 = 2c_2 e^{2x} - c_3 e^{3x}



(b) y_1 = e^{2x} - 2e^{3x}

y_2 = e^x - 2e^{2x} + 2e^{3x}

y_3 = -2e^{2x} + 2e^{3x}



7. y = c_1 e^{3x} + c_2 e^{-2x}



9. y = c_1 e^x + c_2 e^{2x} + c_3 e^{3x}
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l. 



(a) 
(b) 
(c) 
(d) 






-f 


_-l 


0_ 


"-1 


0" 





-1_ 


"1 0" 




0_ 




"0 0" 




1_ 





(a) 



(b) 



(c) 



(a) 



(b) 



(c) 



1 













1 















-1 


1 















■ 1 













1 




1 













1 













1 







• 1 





1 


















1 


1 


















-1 





1 















1 







1 







1 









7. Rectangle with vertices at (0, 0), (-3, 0), (0, 1), (-3, 1) 



9. 





"1 0" 




(a) 


4 1_ 






"1 -2" 


(b) 


_0 


1_ 



10. 



(a) 



(b) 



"1 


0" 





1 




3_ 


"6 


0" 


_0 


1_ 



12. 



(a) 



(b) 



(c) 



(d) 



2 

1 

1 

2 1 

1 

1 



3 

1 

1 4 

1 

'4 

1 



; expansion in the y-direction by a factor of 3, then expansion in the x-direction by a factor of 2 



; shear in the x-direction by a factor of 4, then shear in the y-direction by a factor of 2 



1 
-2 



; expansion in the y-direction by a factor of -2, then expansion in the x-direction by a 



factor of 4, then reflection about y = x 



1 

4 1 



1 
1 18 



1 -3 
1 



; shear in the x-direction by a factor of -3, then expansion in the y-direction by a 



factor of 18, then shear in the y-direction by a factor of 4 



14. 



(a) 



(b) 



1 
-5 



f2 -1 

6i/3 I 3 6 + 3^3 



17. 



(a) y =l* 



y = x 



(b) 



(0 ""I* 



(d) 



y — — 2x 



22. 





"1 





0" 


(a) 








1 







1 







"0 





f 


(b) 





1 







1 










"0 


1 


0" 


(c) 


1 
















1 



24. 



(a) A l = 1: 



(b) A l = 1: 



1 
_0_ 

'0" 
1 



;a 2 = -1: 



;a 2 =-i: 



(c) A l = 1: 



(d) 



(e) 



A=l: 



A=l: 



1 
_1 


;a 2 = 


-l: 


-1 
1 


"1" 








_0_ 








"0" 








_1_ 









(f) 



(θ an odd integer multiple of π) λ = -1: (1, 0), (0, 1)
(θ an even integer multiple of π) λ = 1: (1, 0), (0, 1)
(θ not an integer multiple of π) no real eigenvalues
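The three cases for the rotation matrix follow from its characteristic polynomial λ^2 - 2 cos θ λ + 1, whose discriminant is -4 sin^2 θ. A minimal sketch, assuming the standard rotation matrix [[cos θ, -sin θ], [sin θ, cos θ]] and a small tolerance for floating-point roundoff:

```python
import math


def rotation_real_eigenvalues(theta, tol=1e-12):
    """Real eigenvalues of the rotation through theta, if any.

    The discriminant -4 sin^2(theta) is negative unless sin(theta) = 0,
    in which case the double eigenvalue is cos(theta) = +1 or -1.
    """
    if abs(math.sin(theta)) > tol:
        return []
    return [round(math.cos(theta))]


print(rotation_real_eigenvalues(math.pi))      # odd multiple of pi
print(rotation_real_eigenvalues(2 * math.pi))  # even multiple of pi
print(rotation_real_eigenvalues(math.pi / 3))  # not a multiple of pi
```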
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'•--M* 



3. y = 2 + 5x - 3x^2



8. y = 4 - .2x + .2x^2; if x = 12, then y = 30.4 ($30.4 thousand)
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Exercise 98t-+9£5^fcisge-4i85$ 

(a) 



(a), (c^e^LCbf 



,„ _ i sin 2x sin 3x 
suij: I — - — I — - — + ■ 



sinwj: 
n 



3. 



3. 



5. 



(a) 

(a) "^ = 



(b) 1 
(b) 



9 


3 


-4" 


3 


-1 


1 
2 


-4 


1 


4 




2 





(a) " 



2 2 
1 



5 
2 

9 
2 



-3 



m 



A = 



8. 



E|sm 






1 


r 




2 


2 


1 





i 


2 




2 


1 


1 





2 


2 





(d) 



A = 



(e) 



A = 



/2 /2 -4/3 

/2 

-4/3 -/I 

11 0-5 

110 

0-1 2 

-5 2-1 



(a) max value = 5 at ±(1, 0); min value = -1 at ±(0, 1)



<U\ 1 U I l/lO Qt I 

to; max value = H dL 



1/20-6/10 ' /20 1 6/To 



• • 1 n-l/lO at 
5 mm value = H dl 



(c) 



max 



(d) 



max 



-1 



1 



l/20 + 6/T0 ' /20-6/l0 



value = ^ at -1- 



-1 



/20-6/T0 ' /20-6/T0 



' min value 



7 — l /To 



at 



1 



1 



1/20 -1 e/io ' /20-6/10 



value = ^ at ' 

3 -3 



t/20-2/l0 ' /20T2/T0 



» min value 



3 — l /To 



at 



/20 + 2/To ' /20-2/To 



7. (b) 



9. (a) 



11. Positive definite 

(a) 



(b) 



Negative definite 



Positive semidefinite 



(c) 



(d) 



Negative semidefinite 



Indefinite 



(e) 



Indefinite 



(f) 



13. (c) 



16. 



(a) 



1 


-1 


-1 


n 
-1 


n(n — 1 ) 

I 
n 

-1 


n(n — 1 ) 

-1 


n {n — 1 ) 
-1 


n {n — 1 ) 
-1 



A = 



«(« — 1) n{n — 1 ) n{n — 1 ) 
Positive semidefinite 



(b) 
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-1 


n{n — 1 ) 


-1 


n in — 1 ) 


1 



(a) 



*1 

*2 



1 


1 




/2 


f> 


>r 


1 


1 


72 


/2 


_ ^J 





W I 3^2 



(b) 



*1 
*2 



1 


2 


~7l~ 


2 


1 


72 



^1 I 672 



(c) 



*1 

*2 



1 


1 




/2 


/2 


~y\~ 


1 


1 


72 


f2 


~f 2 \ 





2 2 
^l-7 2 



(d) 



*1 
*2 



^17-4 {Vl I 4 



/34-8/17 |/34 I S/17 
1 -1 



/34-8/17 i/34 + 8/r7 



71 
72 



(1 + /l7)y? I (1 - /T7)y 2 2 



(a) 2x^2 - 3xy + 4y^2



(b) x^2 - xy



(c) 5xy



(d) 4x^2 - 2y^2



(e) 



5. 



(a) 



(b) 



(c) 



(d) 



(e) 



[* y] 
[* y] 

[* y] 

[* y] 
[* y] 



2 -I 



1 -2 



+ [-7 2] 



x 

y 



+ 7 = 



+ [5 3] 



-3 = 



"°f 


~ x~ 


J °. 


y 



-8 = 



"4 


0" 


' x~ 


-7: 


= 


_0 


-2_ 


y 






"0 


o" 


~ x~ 


., r^i 





1_ 


y 


+ [7 


-8] 


y 



-5 = 



(a) 9x'^2 + 4y'^2 = 36, ellipse



(b) x'^2 - 16y'^2 = 16, hyperbola



(c) y'^2 = 8x', parabola



(d) x'^2 + y'^2 = 16, circle



(e) 18y'^2 - 12x'^2 = 419, hyperbola



(f) 



y' = — ^' , parabola 



9. 2x''^2 + y''^2 = 6, ellipse



11. 2x''^2 - 3y''^2 = 24, hyperbola



15. (a) Two intersecting lines, y = x and y = -x



(b) 



No graph 



(c) 



The graph is the single point (0, 0). 



(d) 



The graph is the line y = x. 



, ■, The graph consists of two parallel lines rrz x ' rrr^ ~ * • 

The graph is the single point (1,2). 
(f) 
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(a) 



x^2 + 2y^2 - z^2 + 4xy - 5yz



(b) 3x^2 + 7z^2 + 2xy - 3xz + 4yz



(c) xy + xz + yz



(d) x^2 + y^2 - z^2



(e) 3z^2 + 3xz



(f) 2z^2 + 2xz + y^2



3. 



(a) 



[* ^ z ] 



2 
2 



(b) 



[x y z] 



»-! 

3 1 

1 

-4 2 



o" 




5 


~*~ 


2 


y 


1 


z 


_ 





I [7 2] 



-3 = 



3~ 




2 


"*" 


2 


y 


7 


z 


_ 





I [-3 0] 



-4 = 



(c) 



[ x y z ] 



o I I 




2 2 






~x~ 


4 o 1 


y 


2 2 






z 


I I 




2 2 





1 = 



(d) [x y z] 



1 


0" 


~x~ 


1 





y 





-1 


z 



-7 = 



(e) 



[ x y z ] 



| 


" x~ 





y 


f 3 


z 



+ [0 -14 0] 



+ 9 = 





"o o r 


"jt" 


(f) [* y z] 


1 


7 




1 2 


z 



4- [2 -1 3] 



= 



(a) 9x'^2 + 36y'^2 + 4z'^2 = 36, ellipsoid



(b) 6x'^2 + 3y'^2 - 2z'^2 = 18, hyperboloid of one sheet



(c) 3x'^2 - 3y'^2 - z'^2 = 3, hyperboloid of two sheets



(d) 4x'^2 + 9y'^2 - z'^2 = 0, elliptic cone



(e) x'^2 + 16y'^2 - 16z' = 0, elliptic paraboloid



,' 2 ?„' 2 , J 



(f) 



7* —1y I z = 0, hyperbolic paraboloid 



(g) 



x'^2 + y'^2 + z'^2 = 25, sphere



9. x''^2 + y''^2 - 2z''^2 = -1, hyperboloid of two sheets



11. /' — _y " | z" = 0, hyperbolic paraboloid 
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1. Multiplications: mpn; additions: mp{n — 1) 



                                              n = 5     n = 10     n = 100         n = 1000

Solve Ax = b by Gauss-Jordan elimination      +: 50     +: 375     +: 383,250      +: 333,283,500
                                              ×: 65     ×: 430     ×: 343,300      ×: 334,333,000

Solve Ax = b by Gaussian elimination          +: 50     +: 375     +: 383,250      +: 333,283,500
                                              ×: 65     ×: 430     ×: 343,300      ×: 334,333,000

Find A^{-1} by reducing [A | I] to            +: 80     +: 810     +: 980,100      +: 998,001,000
[I | A^{-1}]                                  ×: 125    ×: 1000    ×: 1,000,000    ×: 1,000,000,000

Solve Ax = b as x = A^{-1}b                   +: 100    +: 900     +: 990,000      +: 999,000,000
                                              ×: 150    ×: 1100    ×: 1,010,000    ×: 1,001,000,000

Find det(A) by row reduction                  +: 30     +: 285     +: 328,350      +: 332,833,500
                                              ×: 44     ×: 339     ×: 333,399      ×: 333,333,999

Solve Ax = b by Cramer's Rule                 +: 180    +: 3135    +: 33,163,350   +: 33,316,633 × 10^4
                                              ×: 264    ×: 3729    ×: 33,673,399   ×: 33,366,733 × 10^4
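Three of the rows above are reproduced exactly by simple closed forms in n. The formulas in this sketch are assumptions inferred by fitting the listed values, not quoted from the text, and the function names are illustrative; each function returns the pair (additions, multiplications).

```python
def counts_inverse(n):
    """Find the inverse by reducing [A | I] to [I | inverse]."""
    return (n**3 - 2 * n**2 + n, n**3)


def counts_solve_via_inverse(n):
    """Solve Ax = b as x = (inverse of A) times b."""
    return (n**3 - n**2, n**3 + n**2)


def counts_det(n):
    """Find det(A) by row reduction; both numerators divide evenly."""
    return ((2 * n**3 - 3 * n**2 + n) // 6, (n**3 + 2 * n - 3) // 3)


for n in (5, 10, 100, 1000):
    print(n, counts_inverse(n), counts_solve_via_inverse(n), counts_det(n))
```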










                                              n = 5            n = 10           n = 100          n = 1000
                                              Execution Time   Execution Time   Execution Time   Execution Time
                                              (sec)            (sec)            (sec)            (sec)

Solve Ax = b by Gauss-Jordan elimination      1.55 × 10^-4     1.05 × 10^-3     .878             836

Solve Ax = b by Gaussian elimination          1.55 × 10^-4     1.05 × 10^-3     .878             836

Find A^{-1} by reducing [A | I] to            2.84 × 10^-4     2.41 × 10^-3     2.49             2499
[I | A^{-1}]

Solve Ax = b as x = A^{-1}b                   3.50 × 10^-4     2.65 × 10^-3     2.52             2502

Find det(A) by row reduction                  1.03 × 10^-4     3.21 × 10^-4     .831             833

Solve Ax = b by Cramer's Rule                 6.18 × 10^-4     90.3 × 10^-4     83.9             834 × 10^3
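The execution times are consistent with charging a fixed cost per arithmetic operation. The sketch below assumes 0.5 microseconds per addition and 2.0 microseconds per multiplication; these rates are inferred from the tabulated values (they reproduce most entries) and are not stated in this answer.

```python
ADD_MICROSECONDS = 0.5   # assumed cost of one addition
MUL_MICROSECONDS = 2.0   # assumed cost of one multiplication


def execution_time_seconds(additions, multiplications):
    """Estimated run time in seconds for the given operation counts."""
    micro = additions * ADD_MICROSECONDS + multiplications * MUL_MICROSECONDS
    return micro * 1e-6


print(execution_time_seconds(50, 65))     # Gauss-Jordan, n = 5
print(execution_time_seconds(30, 44))     # det by row reduction, n = 5
print(execution_time_seconds(180, 264))   # Cramer's Rule, n = 5
```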
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!• *j = 2> X2 = 1 



3. x_1 = 3, x_2 = -1



5. x_1 = -1, x_2 = 1, x_3 =



7. x_1 = -1, x_2 = 1, x_3 =



9. x_1 = -3, x_2 = 1, x_3 = 2, x_4 = 1



11. 



(a) 



A = LU = 



2 0] 


1 I 


1 




2 


2 


2 1 











1 


2 1 1 















(b) 



(c) 



^ = ZiZ?t/ = 



A = L 2 U 2 = 



1 


0" 


"2 0] 


1 


1 


1 


1 


1 1 


1 J 



1* 






1 

2 
1 




1 
1 1 
1 1 1 



2 1 


-1" 





1 









13. 



(b) 



a b 




"1 0] 


c d 


— 


c - 1 




a 



a b 

ad — be 







a 



18. 



A = PLU = 



"1 0" 


"3 0] 


1 


2 


1 


3 1 J 



1 


n 


3 




1 


1 




2 





1 
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1. (a-d) 




(a) x = -2, y = -3



(b) x = 2, y = 1



5. (a) 2 + 3i



(b) -1 - 2i



(c) -2 + 9i




= 3 + 1 




ft -=5= -6- 3d' 



(o) 



(« 



9. 



(a) z_1 z_2 = 3 + 3i, z_1^2 = -9, z_2^2 = -2i



(b) z_1 z_2 = 26, z_1^2 = -20 + 48i, z_2^2 = -5 - 12i



(c) z_1 z_2 = 11/3 - i, z_1^2 = (4/9)(-3 + 4i), z_2^2 = -6 - (5/2)i
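The products and squares in answer 9 can be checked with exact arithmetic. The operand pairs below are assumptions, chosen because they reproduce every listed value; the exercise statement itself is not part of this answer section. Complex numbers are stored as (real, imaginary) pairs of fractions.

```python
from fractions import Fraction as F


def cmul(a, b):
    """Multiply complex numbers stored as (real, imag) pairs."""
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])


def summary(z1, z2):
    """Return (z1*z2, z1^2, z2^2)."""
    return cmul(z1, z2), cmul(z1, z1), cmul(z2, z2)


# Assumed operands (z1, z2) for parts (a), (b), (c)
pairs = {
    "a": ((F(0), F(3)), (F(1), F(-1))),               # 3i and 1 - i
    "b": ((F(4), F(6)), (F(2), F(-3))),               # 4 + 6i and 2 - 3i
    "c": ((F(2, 3), F(4, 3)), (F(1, 2), F(-5, 2))),   # (2 + 4i)/3 and (1 - 5i)/2
}

for part, (z1, z2) in pairs.items():
    print(part, summary(z1, z2))
```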



I- 



11. 76 - 88i

12. 26 - 18i

16. (2 + √2) + i(1 - √2)
18. -24i


20. 



(a) 

(b) 
(c) 
(d) 



13 I 13i 

1+i 

7 + 9i 

6 + 2i 
-1 I 6i 



— 3 I 1 2 i — 33 — 22J 

i 

-6 I 6: -16-16! 

-11 I 19i 



-9-5i 



6i 1 +! 
-6-i 5-9! 



22 -7i 2| 10i 
-5-4; 6-8! 

9-! -1-! 



22. (a) z = -1 ± i



W^^T 
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(a) 



(b) 



(c) 



2-7i 



-^-i5i 



-5i 



(d) 



(e) 



(f) 



-9 







5. -i 

(a) 



(b) 26 26 



7i 



(c) 



2 ' 2 



7 24 



625 625 



11. (1 - √3)/4 + ((1 + √3)/4)i



15. -1-2! 

(a) 



— 2— i* 
(b) 25 25 



18. 



(a) 



*>' 



? j 



(b) 



+ >' 






A 

S 



(c) 







23. *i*2 I yiyi 



27. 



(c) 



Yes, if ? ^0- 



30. 1 , . 1 . 

*1 = 2 + ! ' *2 = 2' ^3 = 2 ~ ! 



33. ti = (1 +0i»X2 = 2i 



35. 



(a) 



(b) 



i 


2 


_-l 


i 





r 


— i 


2i_ 



39. 



(a) 



(b) 



— i — 2 — 2i —l|i 
1 2 -i 

j i 1 

1 +! -i 1 

-7 + 6; 5 - i 1 + Ai 
1 + 2j - i 1 



Exercise Set 10.3 (page 539) 



1. o 

(a) 



tt/2 



(b) 



-t/2 



(c) 



tt/4 



(d) 



(e) 



(f) 



2W3 



-W4 



3. 



(a 



} 2 Hf) + '*(!)] 



4[cos7r + z sinTr] 



(b) 

(c) 5^2[co,(}) + i«ag)] 

(d) 12 Ht) + Ht)] 



(e) 



(f) 



3/2 



-'-f) l!Sm (-f 



cos (_|) liriaf-f) 



5. 1 



7. 







(b) 



(c) 



(d) 



(e) 




(f) 



J 


I) 




-t+\5^— 




\ V3 * r 


tf-i\^ 




1 - v 3 j 



10. 4 



j^cgj+i ■(!)]. y* 



co.|*Ui«(&. 



The roots are ±(2^{1/4} + 2^{1/4} i), ±(2^{1/4} - 2^{1/4} i), and the factorization is

z^4 + 8 = (z^2 - 2^{5/4} z + 2^{3/2})(z^2 + 2^{5/4} z + 2^{3/2}).
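A numerical sketch of this answer: the four roots should satisfy z^4 = -8, and multiplying the two quadratic factors should give back z^4 + 8. It assumes the roots are ±(2^{1/4} ± 2^{1/4} i), the fourth roots of -8, and works in floating point, so comparisons use a small tolerance.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, highest power first."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


r = 2 ** 0.25
roots = [complex(r, r), complex(-r, -r), complex(r, -r), complex(-r, r)]

b = 2 ** 1.25   # 2^(5/4)
c = 2 ** 1.5    # 2^(3/2)
product = poly_mul([1.0, -b, c], [1.0, b, c])

print([abs(z**4 + 8) < 1e-9 for z in roots])
print(product)  # approximately [1, 0, 0, 0, 8]
```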



Re(z) = -3,Im(z) = 
(a) 



(b) 



Re(z) = -3.1m(z) = 



(c) 



Re(z) = 0,lm(z) = - ^2 



(d) 



Re(z) = -3,Im(z) = 



20. cos 2θ = cos^2 θ - sin^2 θ, sin 2θ = 2 sin θ cos θ

cos 3θ = cos^3 θ - 3 sin^2 θ cos θ, sin 3θ = 3 sin θ cos^2 θ - sin^3 θ
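These identities come from expanding (cos θ + i sin θ)^n with De Moivre's formula and comparing real and imaginary parts. A small numerical sketch of the n = 3 case (function names are illustrative):

```python
import math


def de_moivre(theta, n):
    """Real and imaginary parts of (cos theta + i sin theta)**n."""
    z = complex(math.cos(theta), math.sin(theta)) ** n
    return z.real, z.imag


def triple_angle(theta):
    """cos 3θ and sin 3θ from the expanded identities."""
    c, s = math.cos(theta), math.sin(theta)
    return c**3 - 3 * s**2 * c, 3 * s * c**2 - s**3


for theta in (0.3, 1.1, 2.5):
    print(de_moivre(theta, 3), triple_angle(theta))
```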



Exercise Set 10.4 (page 544) 



(a) 



(b) 



(c) 



(d) 



(e) 



(f) 



X -i, -2-i,4) 



3 + 2i, -l-2i, -3 i S, -i) 



— 1 — 2i, 2i, 2 — i, — 1) 



-3 ! 9i, 3-3i, — 3 — 6i, 12 I 3z) 



- 3 | 2i, 3, - 3 - 3i, i) 



1-S,3i,4, -5) 



5. 



(a) 



^ 



(b) 



2^3 



(c) 



/To 



(d) 



^37 



9. 3 

(a) 



(b) 



(c) 



2-27; 



-5-lOJ 



11. Not a vector space. Axiom 6 fails; that is, the set is not closed under scalar multiplication. (Multiply by i, for 
example.) 



13. 



ker T is all multiples of 



l + 3i 

l+i 

-2 



; nullity of T — 1 



17. (a) (-3 - 2i)u + (3 - i)v + (1 + 2i)w



(b) (2 + i)u + (-1 + i)v + (-1 - i)w



(c) 0u + 0v + 0w



(d) (-5 - 4i)u + (5 + 2i)v + (2 + 4i)w



19. (a), (b), (c) 

21. (b), (c) 

23. f - 3g - 3h = 0

25. (a), (b) 

27. (_!_;_ 1 ) ; dimension = 1 

30 ' (I U " 2 ' ! ' °)' ( " 4 ' 4*' °' ! ) dffliension = 2 



Exercise Set 10.5 (page 551) 



2. 


-12 
(a) 





(b) 




2i 




(c) 




37 
(d) 


4. 


-4 | 5i 
(a) 





(b) 




4-4J 




(c) 




42 
(d) 


6. 


-9-5i 



8. No. Axiom 4 fails. 



io. j/To 

(a) 

2 
(b) 

ft 

(c) 


(d) 



12. 3i/10 

(a) 

(b) 



14. 



(a) 



(b) 



2 1/2 



16. 



(a) 



7^2 



(b) 



2/3 



20. (b) 



23. / ; _ ~ ; ' _ ; q 2; i ' i 2; 3; 2; — 2; j / _ ; 2; i_ i 



25. 



(a) vi = Hr--7r.-7r ' v 2 = 



ft* ft* fi> 



(b) vi = (i,0,0),v 2 = 0, 



-J=,J=^ , V3 = 
/2 /2 J 



j j 2i 



7i — 2i 



fi3' {53 



v 3 = 0, 



2z 7z \ 



^53'/53_ 



27. 



vi = 



n ' 1 — M „, [ 3; 2 1 I ; 
U, -7=, — — |»V2 =I 



^' ^ ' 



/l5' /l5' /l5 



36. /r. 3 1 

" = - yltvi H -7=V2 - "7=V3 
^6 ^2 



Exercise Set 10.6 (page 561) 



l. 



(a) 



-2i 4 5-i 
1+i 3-; 



(b) 



(c) 



(d) 



- 2i 4 - i 

1+i 5 + 7; 3 
-1-i i 1 

-7i 



3i 

AH a 2 \ 
&12 ail 



3. k = 3 + 5i,l = i,m = 2-4i 



(a),(b) 



5. 



(a) A -i 



3 
5 



4 
5 



(b) 



A~* = 



A t -¥ 



1 -1+i 
ft 2 



1 1-i 



(c) 



il"^ 






(d) 



i!"^ 



2 
1-i 



3-i 



2 fi 2{\5 

1 1 4-3; 



2 /3 2/15 

1 _L_ 5; 

2 /3 2/l5 



7. 



P = 



-1 I i 

1 



1-i " 




fe 


■P~ X AP = 


'3 0" 


2 


•> 


U 6 


fe\ 





9. 



P = 



1+i 


1+il 




/6 


^ 


•P _1 J 4f = 


"2 0" 


2 


1 


•> 


a 


/6 


^ J 





11. 



p = 



1 

1 — i q 1 — i 



f {3 

-2- -L 

/6 /3 





"1 


0" 


9 


5 










-2 



14. 



(a) 



i 
■i 



is one possibility. 



Supplementary Exercises (page 563) 



- 



— i 




"1" 


1 


? 










1 



is one possibility. 



5. A=1,co,;j 2 ( = £) 



Exercise Set 11.1 (page 572) 



1. (a) y = 3x - 4



(b) y = -2x + 1



(a) x^2 + y^2 - 4x - 6y + 4 = 0 or (x - 2)^2 + (y - 3)^2 = 9



(b) x^2 + y^2 + 2x - 4y - 20 = 0 or (x + 1)^2 + (y - 2)^2 = 25



x^2 + 2xy + y^2 - 2x + y = 0 (a parabola)



4. (a) x + 2y + z = 0



(b) -x + y - 2z + 1 = 0
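The two forms given for each circle are related by completing the square. A minimal sketch, assuming the general form x^2 + y^2 + dx + ey + f = 0; the function name is illustrative.

```python
from fractions import Fraction


def circle_center_radius_sq(d, e, f):
    """Convert x^2 + y^2 + d*x + e*y + f = 0 to center (h, k) and radius squared.

    Completing the square gives (x + d/2)^2 + (y + e/2)^2 = d^2/4 + e^2/4 - f.
    """
    h = Fraction(-d, 2)
    k = Fraction(-e, 2)
    return (h, k), h * h + k * k - f


print(circle_center_radius_sq(-4, -6, 4))    # circle in part (a)
print(circle_center_radius_sq(2, -4, -20))   # circle in part (b)
```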



(a) 



x 


J 


z 


o" 


*1 


Jl 


^1 


1 


*2 


^2 


^2 


1 


*3 


J3 


23 


1 



(b) 



= 



x + 2y-\-z = 0; -x I .y - 2z = 



6. (a) x^2 + y^2 + z^2 - 2x - 4y - 2z = -2 or (x - 1)^2 + (y - 2)^2 + (z - 1)^2 = 4



(b) x^2 + y^2 + z^2 - 2x - 2y = 3 or (x - 1)^2 + (y - 1)^2 + z^2 = 5



10. 



y 


* 2 


X 1 


y\ 


A 


XI 1 


yi 


A 


*2 1 


73 


4 


*3 1 



= 



11. The equation of the line through the three collinear points 



12. = 



13. The equation of the plane through the four coplanar points 



Exercise Set 11.2 (page 576) 



1. j 255 , 97 , 158 
h ~ W h ~ ITT' h ~ ITT 



2. , 13 , 2 1 11 

h = — ' h = - j' h = — 



3. , 5 j 1 7 6 



' /l=^/ 2 = 0./ 3 = 0./4=^/j = ^/6 = ^ 



Exercise Set 11.3 (page 588) 



* i = 2> X2 = — ' maximum value of z — 4^- 



2. No feasible solutions 



3. Unbounded solution 



4. Invest $6000 in bond A and $4000 in bond B; the annual yield is $880. 

j- cup of milk, -^- ounces of corn flakes; minimum cost = ^f^- = 13.6 £ 
9 18 18 



(a) x_1 ≥ 0 and x_2 ≥ 0 are nonbinding; 2x_1 + 3x_2 ≤ 24 is binding

(b) x_1 - x_2 ≤ v for v < -3 is binding and for v < -6 yields the empty set.

(c) x_2 ≤ v for v < 8 is binding and for v < 0 yields the empty set.



7. 550 containers from company A and 300 containers from company B; maximum shipping charges = $2110

925 containers from company A and no containers from company B; maximum shipping charges = $2312.50
9. 0.4 pound of ingredient A and 2.4 pounds of ingredient B; minimum cost = 24.8¢

Exercise Set 11.4 (page 595) 

1. 700 



2. 5 

(a) 

4 
(b) 



4. Ox, ^i. units; sheep, ^- units 

(a) il Zi 

g 7 4 

First kind, -^- measures; second kind, -^- measures; third kind, -£- measures 

(b) Zj Zj Zj 



5. 



/ \ X] = ' A 2 — "l M'* — ■£?->5"-9" 

W tf — 2 

Exercise 7(b); gold, 30^- minae; brass, 9^- minae; tin, 14^- minae; iron, 5^- minae 
(b) Z Z Z Z 



5jc 17 I z - K = 



(b) 



x = 21t/131, y = 14t/131, z = 12t/131, K = t, where t is an arbitrary number
Take t = 131, so that x = 21, y = 14, z = 12, K = 131.



(c)



Take t = 262, so that x = 42, y = 28, z = 24, K = 262.
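The one-parameter family of solutions can be checked by substitution. The system used below (5x + y + z = K, x + 7y + z = K, x + y + 8z = K) is a hypothetical reconstruction consistent with the listed values, not a quotation of the exercise; the function names are illustrative.

```python
from fractions import Fraction


def solution(t):
    """The family x = 21t/131, y = 14t/131, z = 12t/131, K = t."""
    t = Fraction(t)
    return (21 * t / 131, 14 * t / 131, 12 * t / 131, t)


def satisfies_system(x, y, z, k):
    # Assumed system: 5x + y + z = K, x + 7y + z = K, x + y + 8z = K
    return 5 * x + y + z == k and x + 7 * y + z == k and x + y + 8 * z == k


for t in (131, 262):
    print(t, solution(t), satisfies_system(*solution(t)))
```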



7. 



(a) 
(b) 



7 7 

Legitimate son, 577 j- staters; illegitimate son, 422-| staters 



Gold, 30— minae; brass, 9— minae; tin, 14— minae; iron, 5— minae 



(c) 



First person, 45; second person, 37 -jr', third person, 22-^- 



Exercise Set 11.5 (page 606) 



(a) S(x) = -.12643(x - .4)^3 - .20211(x - .4)^2 + .92158(x - .4) + .38942



(b) S(.5) = .47943; error = 0%
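Evaluating the cubic piece at x = .5 reproduces the value in part (b). The comparison with sin x in this sketch is an assumption: the constant term .38942 matches sin(.4), which suggests the interpolated data are sine values.

```python
import math


def spline(x):
    """Cubic piece from part (a), expanded about x = 0.4."""
    d = x - 0.4
    return -0.12643 * d**3 - 0.20211 * d**2 + 0.92158 * d + 0.38942


value = spline(0.5)
print(round(value, 5))
print(abs(value - math.sin(0.5)))  # small if the data are sine values
```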



3. (a) The cubic runout spline



(b) S(x) = 3x^3 - 2x^2 + 5x + 1



S(x) =

-.00000042(x + 10)^3 + .000214(x + 10) + .99815, -10 ≤ x ≤ 0

.00000024(x)^3 - .0000126(x)^2 + .000088(x) + .99987, 0 ≤ x ≤ 10

-.00000004(x - 10)^3 - .0000054(x - 10)^2 - .000092(x - 10) + .99973, 10 ≤ x ≤ 20

.00000022(x - 20)^3 - .0000066(x - 20)^2 - .000212(x - 20) + .99823, 20 ≤ x ≤ 30



Maximum at (x, S(x)) = (3.93, 1.00004)



.0000009(* | 10) J -.0000121(x | 10) J | .000282(x + 10) + .99815, -10<x<0 



S(x) = 



.000000900" 



-.0000093(x) J 



I .000070(x) 



+ .99987, 



0<x<10 



.0000004(*-10) J -.0000066(x-10r-.000087(x-10) I .99973, 10<x<20 
.0000004(*-20) 3 -.0000053(x-20) 2 -. 000207(^-20) I- .99823, 20<x<30 



Maximum at (*, S(x)) = (4.00, 1.00001) 




- 4x 3 + 3x 



0<*<0.5 



(c) 



4x 3 -12* 2 I 9x-l 0.5<x<l 

2-2x 0.5<x<l 
2-2x \<x<\.5 

The three data points are collinear. 



(b) 



4 10 
14 10 
14 1 


10 



ooo r 


" Mi 







M 2 







M 3 


6 
~ h 2 


14 1 


M n -2 




14 


M n -\ 





7h-1 - 2yi + y 2 

y\- 2y 2 + 73 

y 2 - 2y 2 + yA 
7h-3-2^ h -2+7h-1 



8. 



(b) 



2 10 
14 10 
14 1 






f 


" Mi 







M 2 







M 3 


6 

"a 2 


4 1 


M H _i 




112 


M» 





-k?\ - 71+72 

71- 2y 2 I- 73 

72- 2y 3 I 74 

7h-2-27h-1 I 7h 
7h-1- 7h + ^7h 



Exercise Set 11.6 (page 617) 



(a) x^(1) = [.4, .6]^T, x^(2) = [.46, .54]^T, x^(3) = [.454, .546]^T, x^(4) = [.4546, .5454]^T, x^(5) = [.45454, .54546]^T



(b) P is regular, since all entries of P are positive; q = [5/11, 6/11]^T
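The state vectors and the steady-state vector can be reproduced by repeated multiplication. The transition matrix and initial state below are assumptions chosen because they generate exactly the listed vectors; they are not quoted from the exercise. Exact fractions avoid rounding questions.

```python
from fractions import Fraction as F

# Assumed transition matrix (columns sum to 1) and initial state
P = [[F(2, 5), F(1, 2)],
     [F(3, 5), F(1, 2)]]
X0 = [F(1), F(0)]


def step(matrix, x):
    """One step of the chain: x -> Px."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in matrix]


def iterate(matrix, x0, n):
    """Return the list of states x(1), ..., x(n)."""
    states, x = [], x0
    for _ in range(n):
        x = step(matrix, x)
        states.append(x)
    return states


states = iterate(P, X0, 5)
for k, x in enumerate(states, start=1):
    print(k, [float(v) for v in x])

q = [F(5, 11), F(6, 11)]
print(step(P, q) == q)  # q is unchanged by P
```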



(a) x^(1) = [.7, .2, .1]^T, x^(2) = [.23, .52, .25]^T, x^(3) = [.273, .396, .331]^T



4. 



(b) P is regular, since all entries of P are positive; q = [22/72, 29/72, 21/72]^T
72 



(a) 



(b) 



(c) 



_9_ 
17 
_8_ 
17 

26 
45 

li 
45 

J_ 
19 

19 

12 
19 



(a) 



(b) 



/* = 



pn 








1 1 



> « = 1 , 2, .... Thus, no integer power of P has all positive entries. 



as n increases, so 



for any X -CPD as n increases. 



, v The entries of the limiting vector 



are not all positive. 



6. 





"l 


1 


r 




2 


4 


4 


/> 2 = 


1 


1 


i 




4 


2 


4 




1 


1 


1 




4 


4 


2 



has all positive entries; q = 



10 
13 



54-7% i n region 1, 16-4% in region 2, and 2 9 -^% in region 3 
6 3 6 



Exercise Set 11.7 (page 627) 



(a) 



(b) 



(c) 












f 


1 





1 


1 


1 


1 





1 


















1 


1 





o" 














1 


1 








1 











1 














1 









10 10 
10 
10 111 
1 
1 
10 10 




(a) 




(b) 



(c) 



1-step: P_1 → P_2
2-step: P_1 → P_4 → P_2
        P_1 → P_3 → P_2
3-step: P_1 → P_2 → P_1 → P_2

1-step: P_1 → P_4
2-step: P_1 → P_3 → P_4
3-step: P_1 → P_2 → P_1 → P_4
        P_1 → P_4 → P_3 → P_4



(a) 



1 











0" 





1 

















1 


1 











1 


2 


1 











1 


2 



(c) 



The ijth entry is the number of family members who influence both the ith and jth family members.



5. (a) {P_1, P_2, P_3}



(b) {P_3, P_4, P_5}



(c) 



{P 2 ,P A ,P6.PZ) and {PA.P5.P6} 



6. (a) None



(b) {P_3, P_4, P_5}









1 


r 


1 














1 





1 





1 









Power of P_1 = 5
Power of P_2 = 3
Power of P_3 = 4
Power of P_4 = 2
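The power of a vertex in a dominance-directed graph is the sum of its row in M + M^2, that is, the number of 1-step plus 2-step connections leaving it. The vertex matrix in this sketch is hypothetical: the entries of the matrix for this answer did not survive, so M was chosen only because it reproduces the listed powers.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def vertex_powers(m):
    """Row sums of M + M^2 for a dominance-directed graph with vertex matrix M."""
    m2 = mat_mul(m, m)
    return [sum(m[i]) + sum(m2[i]) for i in range(len(m))]


# Hypothetical vertex matrix: entry (i, j) is 1 when P_i dominates P_j
M = [[0, 0, 1, 1],
     [1, 0, 0, 0],
     [0, 1, 0, 1],
     [0, 1, 0, 0]]

print(vertex_powers(M))
```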



8. First, A; second, B and E (tie); fourth, C; fifth, D



Exercise Set 11.8 (page 637) 



1. -5/8 

(a) 



[0 1 0] 



(b) 



(c) 



[10 0] 



LetA = 



1 1 
1 1 



, for example. 



(a) P =[0 1],1 = 



v = 3 



(b) P =[0 1 0],1 



,v = 2 



(c) p*=[0 1 ],«!* = 



,v = 2 



(d) p*=[0 1 0],1* = 



,v= -2 



(a) * 
P 



(b) * 
P 



5 3 
3 8 



2 1 

3 3 



^ 



,q = 



, v = 



27 



, V = 



70 



(c) P =[1 0],1 = 



v = 3 



«i) . 



(e) * 
P 



2 1 
5 5 



q = 



, v = 



19 







" l " 


3 


10] 


* 


13 


13 


13 J 


. q = 


12 






13 



V = 



-29 
13 



5. 



P = 









"ll" 


13 


7 " 


* 


20 


20 


20 _ 


1 = 


9 








20 



V = — 



3_ 
20 



Exercise Set 11.9 (page 646) 



l. 



(a) 





"6" 


(b) 


5 




6 
"78" 


(c) 


54 




79 



2. (a) Use Corollary 11.9.4; all row sums are less than one.



(b) Use Corollary 11.9.5; all column sums are less than one.



(c) Use Theorem 11.9.3, with x = [2, 1, 1]^T > Cx = [1.9, .9, .9]^T



E 2 has all positive entries. 



Price of tomatoes, $120.00; price of corn, $100.00; price of lettuce, $106.67 



5. $1256 for the CE, $1448 for the EE, $1556 for the ME 



542 



(b) 503 



Exercise Set 11.10 (page 655) 



1. The second class; $15,000 



2. $223 



3. 1:1.90:3.02:4.24:5.00 



5. 



s/(gl 1 +g 2 1 + - + g fc i 1 



6. 1 : 2 : 3 : ⋯ : n - 1



Exercise Set 11.11 (page 662) 



(a) 



(b) 



"0 


1 


1 


0" 








1 


1 


















3 


3 







2 


2 




n 


n 


1 


1 






2 


2 















(c) 



_2 -1 -1 -2 

-1-1 

3 3 3 3 



t- 



(d) 



.866 1.366 .500 
-.500 .366 .366 




3 



2. 



(b) 



(0, 0, 0), (1, 0, 0), fly. 1, o\ and (1 1, oj 



(c) 



(0,0,0), (1,6,0), (1,1.6,0), (0,1,0) 



3. 



(a) 



"1 


0" 





-1 





1 



(b) 



(c) 



-1 











1 











1 




1 


0" 


1 








-1 



fes 



(a) 



Mi = 



\ o o" 




2 


,M 2 = 


o 1 





1 1 

2 2 







(b) 



cos (-45°) sin(-45°) 
M 4 = 1 

-sin(-45°) cos( -45°) 

P ! -.= M 5 M 4 M 3 (MiP I M 2 ) 



M 3 = 



M 5 = 



1 











cos 20° 


-sin 20° 





sin20° 


cos 20° 



-1 


0" 











1 



(a) 



M\ = 



3 0" 




.5 


,M 2 = 


1 





1 

cos 45° -sin45° 
sin45° cos 45° 



(b) 



cos(-45°) -sin(-45°) 
M 5 = sin(-45°) cos(-45°) , M 6 = 

1 

P'--=M 7 (M 5 M 4 (M 2 M { P | M 3 ) + M 6 ) 



M 3 = 







1 1 



1 1 ... 1 
0-0 
■-■ 



,M 7 = 





" cos 35° 





sin 35° 


,M A = 





1 







-sin 35° 





cos 35° 


2 0" 








1 








1 









cos ,3 sin J 
R\= 10 

— sin.^ cos J 

cos/3 —sin ,3 

R 5 = 1 
sin J cos J 



*2 = 



cos a —sin a 

sina cos a 

1 



*3 = 



cos# sinfl 

1 
— sin ft costf 



,R 4 = 



cosct sinct 

— sin a- cos a- 

1 



(a) 



M = 



(b) 



1 








*0 





1 





yo 








1 


ZQ 











l 



l 








-5~ 





1 





9 








1 


-3 











1 



Exercise Set 11.12 (page 673) 



l. 



(a) 







t\ 




tl 




t3 




t A 









(b) 



t = 



(c) 



n 


1 


1 




4 


4 


1 








4 






1 





n 


4 






n 


1 


l 




4 


4 












r -, 







1 


t\ 




l 


4 


ti 


1 


2 


1 


t3 


-r 





4 


t A 




1 









2 


_ 









&= 



"o" 




V 

8 




" 3 " 
16 




" 7 " 
32 




"is" 

64 


1 

2 

1 


t (2)_ 

5 K 


5 
8 
1 
3 


,& = 


11 
16 
3 
16 


,^ = 


23 
32 
7 
32 


,t^ = 


47 
64 
15 
64 


2 




5 
3 




11 
16 




23 
32 




47 
64 



t®-t= 



J_ 

64 
J_ 
64 
J_ 
64 
±_ 
64 



(d) 



far ij, 4.5%; far ( 2 ,- 1.8% 



2. 1 

2 




13.IBJL22ilJ_2116jlO 
16 16 16 16 16 16 16 16 16 



Exercise Set 11.13 (page 685) 



(c) $x_3^* = \left(\frac{31}{22}, \frac{27}{22}\right)$



(a) $x_3^{(1)} = (1.40000, 1.20000)$
$x_3^{(2)} = (1.41000, 1.23000)$
$x_3^{(3)} = (1.40900, 1.22700)$
$x_3^{(4)} = (1.40910, 1.22730)$
$x_3^{(5)} = (1.40909, 1.22727)$
$x_3^{(6)} = (1.40909, 1.22727)$



(b) Same as part (a)



(c) $x_3^{(1)} = (9.55000, 25.65000)$
$x_3^{(2)} = (.59500, -1.21500)$
$x_3^{(3)} = (1.49050, 1.47150)$
$x_3^{(4)} = (1.40095, 1.20285)$
$x_3^{(5)} = (1.40991, 1.22972)$
$x_3^{(6)} = (1.40901, 1.22703)$



$x_1^* = (1, 1)$, $x_2^* = (2, 0)$, $x_3^* = (1, 1)$

$x_7 + x_8 + x_9 = 13.00$
$x_4 + x_5 + x_6 = 15.00$
$x_1 + x_2 + x_3 = 8.00$
$.82843(x_6 + x_8) + .58579x_9 = 14.79$
$1.41421(x_3 + x_5 + x_7) = 14.31$
$.82843(x_2 + x_4) + .58579x_1 = 3.81$
$x_3 + x_6 + x_9 = 18.00$
$x_2 + x_5 + x_8 = 12.00$
$x_1 + x_4 + x_7 = 6.00$
$.82843(x_2 + x_6) + .58579x_3 = 10.51$
$1.41421(x_1 + x_5 + x_9) = 16.13$
$.82843(x_4 + x_8) + .58579x_7 = 7.04$



8. $x_7 + x_8 + x_9 = 13.00$
$x_4 + x_5 + x_6 = 15.00$
$x_1 + x_2 + x_3 = 8.00$
$.04289(x_3 + x_5 + x_7) + .75000(x_6 + x_8) + .61396x_9 = 14.79$
$.91421(x_3 + x_5 + x_7) + .25000(x_2 + x_4 + x_6 + x_8) = 14.31$
$.04289(x_3 + x_5 + x_7) + .75000(x_2 + x_4) + .61396x_1 = 3.81$
$x_3 + x_6 + x_9 = 18.00$
$x_2 + x_5 + x_8 = 12.00$
$x_1 + x_4 + x_7 = 6.00$
$.04289(x_1 + x_5 + x_9) + .75000(x_2 + x_6) + .61396x_3 = 10.51$
$.91421(x_1 + x_5 + x_9) + .25000(x_2 + x_4 + x_6 + x_8) = 16.13$
$.04289(x_1 + x_5 + x_9) + .75000(x_4 + x_8) + .61396x_7 = 7.04$

Exercise Set 11.14 (page 702) 



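1. $T_i\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \frac{12}{25}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} e_i \\ f_i \end{bmatrix}$, $i = 1, 2, 3, 4$, where the four values of $\begin{bmatrix} e_i \\ f_i \end{bmatrix}$ are $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 13/25 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 0 \\ 13/25 \end{bmatrix}$, and $\begin{bmatrix} 13/25 \\ 13/25 \end{bmatrix}$; $d_H(S) = \ln(4)/\ln(25/12) = 1.888\ldots$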


2. $s \approx .47$; $d_H(S) \approx \ln(4)/\ln(1/.47) = 1.8\ldots$ Rotation angles: 0° (upper left); 90° (upper right); 180° (lower left); 180° (lower right)



3.

(a)
i. $s = \frac{1}{3}$; all rotation angles are 0°;
ii. $d_H(S) = \ln(7)/\ln(3) = 1.771\ldots$
iii. This set is a fractal.

(b)
i. $s = \frac{1}{2}$; all rotation angles are 180°;
ii. $d_H(S) = \ln(3)/\ln(2) = 1.584\ldots$
iii. This set is a fractal.

(c)
i. $s = \frac{1}{2}$; rotation angles: 90° (top); 180° (lower left); 180° (lower right);
ii. $d_H(S) = \ln(3)/\ln(2) = 1.584\ldots$
iii. This set is a fractal.

(d)
i. $s = \frac{1}{2}$; rotation angles: 90° (upper left); 180° (upper right); 180° (lower right);
ii. $d_H(S) = \ln(3)/\ln(2) = 1.584\ldots$
iii. This set is a fractal.

4. $s = .8509\ldots$, $\theta = -2.69°\ldots$

5. (0.766, 0.996) rounded to three decimal places 

$d_H(S) = \ln(16)/\ln(4) = 2$
$\ln(4)/\ln\left(\frac{4}{3}\right) = 4.818\ldots$

8. $d_H(S) = \ln(8)/\ln(2) = 3$; the cube is not a fractal.
9. $k = 20$; $s = \frac{1}{3}$; $d_H(S) = \ln(20)/\ln(3) = 2.726\ldots$; the set is a fractal.
10. [Figure: first, second, third, and fourth iterates]

$d_H(S) = \ln(2)/\ln(3) = 0.6309\ldots$



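11. Area of $S_0 = 1$; area of $S_1 = \frac{8}{9} = 0.888\ldots$; area of $S_2 = \left(\frac{8}{9}\right)^2 = 0.790\ldots$; area of $S_3 = \left(\frac{8}{9}\right)^3 = 0.702\ldots$; area of $S_4 = \left(\frac{8}{9}\right)^4 = 0.624\ldots$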
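
The dimensions quoted in this exercise set all have the form $d_H(S) = \ln k / \ln(1/s)$, where $k$ is the number of congruent scaled copies and $s$ is the scale factor. The following Python sketch recomputes several of them, together with the areas in Exercise 11. The function name `similarity_dimension` is illustrative only, and the value $s = 12/25$ used for Exercise 1 is inferred from the printed result 1.888...

```python
import math


def similarity_dimension(k, s):
    """Return ln(k) / ln(1/s) for k copies, each scaled by the factor s."""
    return math.log(k) / math.log(1 / s)


# (k, s) pairs read off the answers in this exercise set.
# s = 12/25 for Exercise 1 is an inference from the printed value 1.888...
cases = {
    "Exercise 1": (4, 12 / 25),
    "Exercise 3(a)": (7, 1 / 3),
    "Exercise 3(b)": (3, 1 / 2),
    "Exercise 8 (cube)": (8, 1 / 2),
    "Exercise 9": (20, 1 / 3),
}

dimensions = {name: similarity_dimension(k, s) for name, (k, s) in cases.items()}
for name, value in dimensions.items():
    print(f"{name}: d_H = {value:.4f}")

# Exercise 11: each step keeps 8 of 9 subsquares, so area(S_n) = (8/9)**n.
areas = [(8 / 9) ** n for n in range(5)]
print([f"{a:.4f}" for a in areas])
```

An integer value, as for the cube, signals that the set is not a fractal in the sense used in these answers; the non-integer values correspond to the sets identified as fractals.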



Exercise Set 11.15 (page 716) 

1. $\Pi(250) = 750$, $\Pi(25) = 50$, $\Pi(125) = 250$, $\Pi(30) = 60$, $\Pi(10) = 30$, $\Pi(50) = 150$, $\Pi(3750) = 7500$, $\Pi(6) = 12$, $\Pi(5) = 10$



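One 1-cycle: $\{(0, 0)\}$; one 3-cycle: $\left\{\left(\tfrac{1}{2}, 0\right), \left(\tfrac{1}{2}, \tfrac{1}{2}\right), \left(0, \tfrac{1}{2}\right)\right\}$;

two 4-cycles: $\left\{\left(\tfrac{1}{3}, 0\right), \left(\tfrac{1}{3}, \tfrac{1}{3}\right), \left(\tfrac{2}{3}, 0\right), \left(\tfrac{2}{3}, \tfrac{2}{3}\right)\right\}$ and $\left\{\left(0, \tfrac{2}{3}\right), \left(\tfrac{2}{3}, \tfrac{1}{3}\right), \left(0, \tfrac{1}{3}\right), \left(\tfrac{1}{3}, \tfrac{2}{3}\right)\right\}$;

two 12-cycles: $\left\{\left(0, \tfrac{1}{6}\right), \left(\tfrac{1}{6}, \tfrac{2}{6}\right), \left(\tfrac{3}{6}, \tfrac{5}{6}\right), \left(\tfrac{2}{6}, \tfrac{1}{6}\right), \left(\tfrac{3}{6}, \tfrac{4}{6}\right), \left(\tfrac{1}{6}, \tfrac{5}{6}\right), \left(0, \tfrac{5}{6}\right), \left(\tfrac{5}{6}, \tfrac{4}{6}\right), \left(\tfrac{3}{6}, \tfrac{1}{6}\right), \left(\tfrac{4}{6}, \tfrac{5}{6}\right), \left(\tfrac{3}{6}, \tfrac{2}{6}\right), \left(\tfrac{5}{6}, \tfrac{1}{6}\right)\right\}$

and $\left\{\left(\tfrac{1}{6}, 0\right), \left(\tfrac{1}{6}, \tfrac{1}{6}\right), \left(\tfrac{2}{6}, \tfrac{3}{6}\right), \left(\tfrac{5}{6}, \tfrac{2}{6}\right), \left(\tfrac{1}{6}, \tfrac{3}{6}\right), \left(\tfrac{4}{6}, \tfrac{1}{6}\right), \left(\tfrac{5}{6}, 0\right), \left(\tfrac{5}{6}, \tfrac{5}{6}\right), \left(\tfrac{4}{6}, \tfrac{3}{6}\right), \left(\tfrac{1}{6}, \tfrac{4}{6}\right), \left(\tfrac{5}{6}, \tfrac{3}{6}\right), \left(\tfrac{2}{6}, \tfrac{5}{6}\right)\right\}$;

$\Pi(6) = 12$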



(a) 3, 7, 10, 2, 12, 14, 11, 10, 6, 1, 7, 8, 0, 8, 8, 1, 9, 10, 4, 14, 3, 2, 5, 7, 12, 4, 1, 5, 6, 11, 2, 13, 0, 13, 13, 11, 9, 5, 14, 4, 3, 7, ...



(c) (5, 5), (10, 15), (4, 19), (2, 0), (2, 2), (4, 6), (10, 16), (5, 0), (5, 5), ...



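The first five iterates of $\left(\frac{1}{101}, 0\right)$ are $\left(\frac{1}{101}, \frac{1}{101}\right)$, $\left(\frac{2}{101}, \frac{3}{101}\right)$, $\left(\frac{5}{101}, \frac{8}{101}\right)$, $\left(\frac{13}{101}, \frac{21}{101}\right)$, and $\left(\frac{34}{101}, \frac{55}{101}\right)$.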



(b) The matrices of Anosov automorphisms are $\begin{bmatrix} 3 & 2 \\ 1 & 1 \end{bmatrix}$ and $\begin{bmatrix} 5 & 7 \\ 2 & 3 \end{bmatrix}$.



(c) The transformation effects a rotation of S through 90° in the clockwise direction.



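[Figure: the unit square with vertices (0, 0), (1, 0), (1, 1), and (0, 1), divided into regions I, II, III, and IV]

In region I: $\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$; in region II: $\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 0 \\ -1 \end{bmatrix}$; in region III: $\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} -1 \\ -1 \end{bmatrix}$; in region IV: $\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} -1 \\ -2 \end{bmatrix}$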



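$\left(\frac{1}{5}, \frac{3}{5}\right)$ and $\left(\frac{4}{5}, \frac{2}{5}\right)$ form one 2-cycle, and $\left(\frac{2}{5}, \frac{1}{5}\right)$ and $\left(\frac{3}{5}, \frac{4}{5}\right)$ form another 2-cycle.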
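
Several answers in this exercise set can be checked by direct iteration. The sketch below assumes the map in question is Arnold's cat map, $(x, y) \to (x + y, x + 2y) \bmod 1$, acting on points of the form $(i/p, j/p)$; that assumption and the helper names `cat_map_period` and `cycle_lengths` are not taken from this answer key. Points are stored as integer numerators $(i, j)$ modulo $p$.

```python
def cat_map_period(p):
    """Smallest n >= 1 with C**n = I (mod p), where C = [[1, 1], [1, 2]]."""
    a, b, c, d = 1 % p, 1 % p, 1 % p, 2 % p  # entries of C**1 reduced mod p
    n = 1
    while (a, b, c, d) != (1 % p, 0, 0, 1 % p):
        # Left-multiply the current power by C.
        a, b, c, d = (a + c) % p, (b + d) % p, (a + 2 * c) % p, (b + 2 * d) % p
        n += 1
    return n


def cycle_lengths(p):
    """Sorted lengths of all cycles of the map on the p*p grid of points."""
    seen = set()
    lengths = []
    for i in range(p):
        for j in range(p):
            point = (i, j)
            length = 0
            # The map is a bijection on the grid, so each orbit closes on itself.
            while point not in seen:
                seen.add(point)
                point = ((point[0] + point[1]) % p, (point[0] + 2 * point[1]) % p)
                length += 1
            if length:
                lengths.append(length)
    return sorted(lengths)


for p in (5, 6, 10, 25, 30, 50, 125, 250, 3750):
    print(p, cat_map_period(p))
print(cycle_lengths(6))
```

Under the stated assumption, the cycle structure for $p = 6$ accounts for all 36 grid points: one fixed point, one 3-cycle, two 4-cycles, and two 12-cycles.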



Exercise Set 11.16 (page 729) 



1.

(a) GIYUOKEVBH

(b) SFANEFZWJH

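2.

(a) $A^{-1} = \begin{bmatrix} 12 & 7 \\ 23 & 15 \end{bmatrix}$

(b) Not invertible

(c) $A^{-1} = \begin{bmatrix} 1 & 19 \\ 23 & 24 \end{bmatrix}$

(d) Not invertible

(e) Not invertible

(f) $A^{-1} = \begin{bmatrix} 15 & 12 \\ 21 & 5 \end{bmatrix}$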



3. WE LOVE MATH 



Deciphering matrix = $\begin{bmatrix} 7 & 15 \\ 6 & 5 \end{bmatrix}$; enciphering matrix = $\begin{bmatrix} 7 & 5 \\ 2 & 15 \end{bmatrix}$



5. THEY SPLIT THE ATOM 



6. I HAVE COME TO BURY CAESAR



7.

(a) 010100001

(b)



"0 


1 


f 


1 


1 


1 


1 





1 



8. A is invertible modulo 29 if and only if $\det(A) \not\equiv 0 \pmod{29}$.
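
The matrix answers in this exercise set can be checked with a few lines of modular arithmetic. The following Python sketch inverts a 2 × 2 matrix modulo $m$ using the adjugate and the reciprocal of the determinant, and confirms that the deciphering and enciphering matrices listed above are inverses modulo 26. The helper names `inverse_mod` and `multiply_mod` are illustrative, and the test matrix `[[2, 0], [0, 1]]` is a hypothetical example, not one of the exercises.

```python
from math import gcd


def inverse_mod(matrix, m):
    """Inverse of a 2x2 integer matrix modulo m, or None if none exists."""
    (a, b), (c, d) = matrix
    det = (a * d - b * c) % m
    if gcd(det, m) != 1:
        return None  # det has no reciprocal modulo m
    det_inv = pow(det, -1, m)  # modular reciprocal (Python 3.8+)
    return [[(d * det_inv) % m, (-b * det_inv) % m],
            [(-c * det_inv) % m, (a * det_inv) % m]]


def multiply_mod(p, q, m):
    """Product of two 2x2 integer matrices, reduced modulo m."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) % m for j in range(2)]
            for i in range(2)]


enciphering = [[7, 5], [2, 15]]
deciphering = [[7, 15], [6, 5]]

print(inverse_mod(enciphering, 26))
print(multiply_mod(deciphering, enciphering, 26))

# Hypothetical example: det = 2 shares a factor with 26 but not with 29.
print(inverse_mod([[2, 0], [0, 1]], 26))
print(inverse_mod([[2, 0], [0, 1]], 29))
```

Because 29 is prime, every nonzero residue has a reciprocal, which is why the criterion in Exercise 8 reduces to the determinant being nonzero modulo 29.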



Exercise Set 11.17 (page 741) 



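$a_n = \frac{1}{4} + \left(\frac{1}{2}\right)^{n+1}(a_0 - c_0)$

$b_n = \frac{1}{2}$

$c_n = \frac{1}{4} - \left(\frac{1}{2}\right)^{n+1}(a_0 - c_0)$

$n = 1, 2, \ldots$; $a_n \to \frac{1}{4}$, $b_n \to \frac{1}{2}$, $c_n \to \frac{1}{4}$ as $n \to \infty$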

3. 



2 1 

2h+1 = j + „ (2a - Aq - 4c o) 



6(4) 

1 
5(4) 



J 0(4) 
c 2h+1 =° 



) » = 0, 1,2, 



5 1 *1 



*:* = * 



) »=1, 2, 



C2 " = t2" ~ Tw)"^ ~ &0 " 4C0) 



6(4) 



4. 



Eigenvalues: Ai = 1> A? = — > eigenvectors: ei = 



<?2 = 



1 
-1 



5. 12 generations; .006% 



>) 



x w = 



1+ 



>2h+3 



■>2h+1 



>2h+1 
1 



22h+1 
1 



22h+1 






>2m+3 



^m+1 



(-3-/5)(l,/5p' + (-3 + /5)(l-/5) 



M + l 









1 

2 







; x^ - 



n 









1 

2 







as tf 



x 



8. 



"l 








o" 



































1 



Exercise Set 11.18 (page 751) 



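1.

(a) $\lambda_1 = \frac{3}{2}$, $\mathbf{x}_1 = \begin{bmatrix} 1 \\ 1/3 \end{bmatrix}$

(b) $\mathbf{x}^{(1)} = \begin{bmatrix} 100 \\ 50 \end{bmatrix}$, $\mathbf{x}^{(2)} = \begin{bmatrix} 175 \\ 50 \end{bmatrix}$, $\mathbf{x}^{(3)} = \begin{bmatrix} 250 \\ 88 \end{bmatrix}$, $\mathbf{x}^{(4)} = \begin{bmatrix} 382 \\ 125 \end{bmatrix}$, $\mathbf{x}^{(5)} = \begin{bmatrix} 570 \\ 191 \end{bmatrix}$

(c) $\mathbf{x}^{(6)} = L\mathbf{x}^{(5)} = \begin{bmatrix} 857 \\ 285 \end{bmatrix}$, $\mathbf{x}^{(6)} \approx \lambda_1 \mathbf{x}^{(5)} = \begin{bmatrix} 855 \\ 287 \end{bmatrix}$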



7. 2.375 



8. 1.49611 
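
The population vectors in Exercise 1 are consistent with repeated multiplication by a 2 × 2 Leslie matrix. The sketch below reproduces them under assumptions that are inferred from the printed numbers rather than quoted from the exercise: the matrix `L = [[1, 3/2], [1/2, 0]]`, the starting vector `[100, 0]`, and rounding each entry half up after every step. All names are illustrative.

```python
from fractions import Fraction
from math import floor

# Assumed data, inferred from the printed answers (not quoted from the exercise).
L = [[Fraction(1), Fraction(3, 2)],
     [Fraction(1, 2), Fraction(0)]]
x = [Fraction(100), Fraction(0)]


def round_half_up(value):
    """Round a nonnegative rational to the nearest integer, halves upward."""
    return floor(value + Fraction(1, 2))


def step(matrix, vector):
    """One generation: multiply by the matrix, then round each entry."""
    product = [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]
    return [Fraction(round_half_up(entry)) for entry in product]


history = []
for _ in range(6):
    x = step(L, x)
    history.append([int(entry) for entry in x])

# Dominant-eigenvalue estimate of the sixth vector from the fifth one.
lambda_1 = Fraction(3, 2)
estimate = [round_half_up(lambda_1 * entry) for entry in history[4]]

print(history)
print(estimate)
```

The close agreement between the sixth iterate and the eigenvalue estimate illustrates why the dominant eigenvalue governs long-run growth in this model.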



Exercise Set 11.19 (page 760) 



(a) Yield = $33\frac{1}{3}$% of population; $\mathbf{x}_1 = \begin{bmatrix} 1 \\ 1/3 \\ 1/18 \end{bmatrix}$

(b) Yield = 45.8% of population; $\mathbf{x}_1 = \begin{bmatrix} 1 \\ 1/2 \\ 1/8 \end{bmatrix}$; harvest 57.9% of youngest age class



2. $\mathbf{x}_1 = (1.000, .845, .824, .795, .755, .699, .626, .532, 0, \ldots, 0)^T$, $L\mathbf{x}_1 = (2.090, .845, .824, .795, .755, .699, .626, .532, .418, 0, \ldots, 0)^T$

$\dfrac{1.089 + .418}{7.584} = .199$



4. $h_I = (R - 1)/(a_I b_1 b_2 \cdots b_{I-1} + \cdots + a_n b_1 b_2 \cdots b_{n-1})$



5. $h_I = \dfrac{a_1 + a_2 b_1 + \cdots + (a_{J-1} b_1 b_2 \cdots b_{J-2}) - 1}{a_I b_1 b_2 \cdots b_{I-1} + \cdots + a_{J-1} b_1 b_2 \cdots b_{J-2}}$



Exercise Set 11.20 (page 767) 



1. $\dfrac{\pi^2}{3} + 4\cos t + \cos 2t + \dfrac{4}{9}\cos 3t$



+ — T" COS~=-^ + "^7 COS -=-£ I — r- COS -=-£ + — r- COS -=-£ 



3 * 2 



T 2 , 



7 3 J 



^U^ + Iab^ + lsin^ + Iffln^ 



3 - - + ^-sin^-Tr-cos2i--|— cos 4^ 
t 2 3ir 15w 



-'- — — cos i - ^r-=- cos 2t - -A=- cos It - ■ 



1 



t 2 1-3 



3-5 



5-7 



(2»-l)(2»+ 1) 



cosnt 



T ST ( 1 rrtC 2rt 1 rAC 6nt 1 rAC lOirT 

~r — — ^r ~tt cos — =- 4- — r- cos — =- 4- — — cos 



4 J ,2 



w^2 



10" 



1 __ . 2«7rf 

, +-+ r-COS^=^ 

1 (2«) 2 r 



Exercise Set 11.21 (page 775) 



(a) 



Yes; v = I Vl+ | V2 + | V3 



(b) No; v= | Vll i V2 _i V3 



(c) Yes; v = |vi + |v 2 H 0v 3 



(d) Yes; $\mathbf{v} = \frac{4}{15}\mathbf{v}_1 + \frac{6}{15}\mathbf{v}_2 + \frac{5}{15}\mathbf{v}_3$



2. m = number of triangles = 7, n = number of vertex points = 7, k = number of boundary vertex points = 5; Equation (7) is 7 = 2(7) - 2 - 5.



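3. $\mathbf{w} = M\mathbf{v} + \mathbf{b} = M(c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3) + (c_1 + c_2 + c_3)\mathbf{b}$
$= c_1(M\mathbf{v}_1 + \mathbf{b}) + c_2(M\mathbf{v}_2 + \mathbf{b}) + c_3(M\mathbf{v}_3 + \mathbf{b})$
$= c_1\mathbf{w}_1 + c_2\mathbf{w}_2 + c_3\mathbf{w}_3$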



4. 



(a) 




(b) 



v i v 



v 



(a) 



(b) 



(c) 



M = 



M = 



M = 



(d) M = 



1 2 
1_ 


,b 


= 


1 
_2_ 




3 -1" 
1 1_ 


,b = 


"0" 
_1_ 


1 0" 
1_ 


,b = 


2" 
_-3_ 


U 


,b = 


r 

2 


2 








-1_ 



7.

(a) Two of the coefficients are zero

(b) At least one of the coefficients is zero

(c) None of the coefficients are zero



8.

(a) $\frac{1}{3}\mathbf{v}_1 + \frac{1}{3}\mathbf{v}_2 + \frac{1}{3}\mathbf{v}_3$

(b) $\begin{bmatrix} 8/3 \\ 2 \end{bmatrix}$
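
The identity in Exercise 3 says that an affine map $\mathbf{w} = M\mathbf{v} + \mathbf{b}$ carries a convex combination of $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$ to the same combination of their images, provided the coefficients sum to 1. The Python sketch below checks this with exact rational arithmetic. The matrix, translation, and triangle vertices are hypothetical values chosen only for illustration, and the helper names are not from the text.

```python
from fractions import Fraction as F


def affine(M, b, v):
    """Apply w = M v + b to a point v in the plane."""
    return tuple(sum(M[i][j] * v[j] for j in range(2)) + b[i] for i in range(2))


def combine(coeffs, points):
    """Linear combination c1*p1 + c2*p2 + c3*p3 of plane points."""
    return tuple(sum(c * p[i] for c, p in zip(coeffs, points)) for i in range(2))


# Hypothetical data chosen only for illustration.
M = [[F(3), F(-1)], [F(1), F(1)]]
b = (F(0), F(1))
vertices = ((F(0), F(0)), (F(2), F(0)), (F(1), F(3)))
c = (F(4, 15), F(6, 15), F(5, 15))  # convex coefficients: nonnegative, sum to 1

w_direct = affine(M, b, combine(c, vertices))
w_combo = combine(c, tuple(affine(M, b, v) for v in vertices))
print(w_direct, w_combo)

# Triangle count from Exercise 2: m = 2n - 2 - k with n = 7, k = 5.
m = 2 * 7 - 2 - 5
print(m)
```

The two computed points agree only because the coefficients sum to 1; with coefficients summing to anything else, the translation $\mathbf{b}$ would be weighted differently on the two sides.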